DNA sequences of the EBV genome, recombinant DNA molecules, processes for producing EBV-related antigens,
diagnostic compositions and pharmaceutical compositions containing said antigens.

[Figure: Mapping of mRNAs relative to the EBV B95-8 genome]

Described are DNA sequences of the EBV genome cod- 
ing for EBV-related antigens, recombinant DNA molecules 
containing said DNA sequences, vector/host systems for clon- 
ing and expression of said DNA sequences, EBV-related anti- 
gens and methods for their preparation, diagnostic and phar- 
maceutical compositions containing said DNA sequences and 
antigens respectively (Fig. 2). 

Hans Joachim Wolf                                August 22, 1985

Josef Jägerhuber Str. 9
8130 Starnberg

DNA sequences of the EBV genome, recombinant DNA molecules,
processes for producing EBV-related antigens, diagnostic
compositions and pharmaceutical compositions
containing said antigens

Technical field of invention 



This invention relates to DNA sequences of the EBV genome
coding at least for parts of EBV-related antigens to be used in
methods and diagnostic and pharmaceutical compositions referred to
below and methods of localising and isolating at least part of the
DNA.



Furthermore, the invention relates to recombinant DNA
molecules, i.e. cloning and expression vectors useful for
the production of antigenic determinants of said EBV-related
antigens after introduction of these vectors
into appropriate hosts such as bacteria, yeasts and
mammalian cells.

Finally, this invention relates to methods and compositions
or kits, respectively, for a rapid, simple, highly sensitive
and highly specific determination of antibodies directed
to EBV-related antigens. In these tests different antigens
of EBV are used to detect specific antibody classes in
the patient's serum, directed to these antigens. This detection
allows fairly reliable conclusions as to the status
of infection of the serum donor such as preinfection, fresh
infection, chronic infection, convalescence and neoplastic
condition. Furthermore, this invention relates to pharmaceutical
compositions, e.g. vaccines containing said
antigens useful for prophylaxis and therapy of EBV-related
diseases.

Background Art 

The herpesviruses (Herpetoviridae) have enveloped icosahedral
capsids with an overall diameter of 150 nm. The
viral genome consists of a double-stranded DNA with
a molecular weight of approximately 10^8 D. Human herpesviruses
are Herpes simplex I ("fever blisters"), Herpes
simplex II (genital herpes), Varicella-Zoster (chickenpox,
shingles), Cytomegalovirus (congenital abnormalities, e.g.
microcephaly), and Epstein-Barr virus (EBV) (infectious
mononucleosis (IM), Burkitt's lymphoma (BL), nasopharyngeal
carcinoma (NPC)).

Herpesviruses display a remarkable propensity for establishing
latent infections which may persist for the life of
the host. After the primary infection the virus may remain
quiescent, being demonstrable only sporadically or not
at all, until it is reactivated by one of several known
types of stimulus, such as irradiation or immunosuppression.
Such exacerbations of endogenous disease may take the form
of a crop of vesicles on the skin in the case of herpes
simplex or zoster, or more generalized effects in the
case of cytomegalovirus or EBV. The capacity to persist
indefinitely as a latent infection enables these viruses
to survive in nature for a long time. During the last
years, attention has turned to the correlation of human
cancer and EBV.

Epstein-Barr virus (EBV) infections and their consequences



EBV causes infectious mononucleosis as a primary disease.
Predominantly it affects children or young adults. More
than 90 % of the average adult population is infected by
EBV, which persists for life in peripheral B-lymphocytes.
The virus is produced lifelong in the parotid gland and spread
via the oral route.

Serology suggests that EBV might be involved in causing
two neoplastic diseases of man, African Burkitt's lymphoma
(BL) and nasopharyngeal carcinoma (NPC). Infectious mononucleosis
is a consequence of primary infection by EBV.
It is not a life-threatening disease if additional risk
factors are absent.



However, the subjective feeling of sickness, frequently
for extended periods (in the order of several weeks), and
the necessity to avoid physical stress due to the drastically
increased risk of splenic rupture make control of
this disease desirable.



The clinical diagnosis of infectious mononucleosis is
usually derived from a combination of the following parameters:

1. high leukocyte count ranging from 10,000 to 20,000 and
reaching up to 50,000

2. 10 % atypical cells

3. lymphadenitis

4. fever.

Patients with infectious mononucleosis
shed EBV in their saliva. Virus shedding does not require
special prevention against spreading the disease as epidemics
and infection of persons in close contact are rare
(A.S. Evans, "The transmission of EB viral infections.
Viral Infections in Oral Medicine.", edited by J. Hooks,
G. Jordan, Elsevier North Holland Amsterdam, p. 211
(1982)). Virus shedding does not stop with recovery from
disease and at least 60 % (possibly up to 100 %) of the
adult population shed at least low levels of EBV which
is produced lifelong in epithelial cells of the salivary
duct of the parotid gland (H. Wolf, M. Haus, E. Wilmes,
"Persistence of Epstein-Barr virus in the parotid gland",
J. Virol. 51 (1984)).



About 1 % of the infectious mononucleosis cases show
complications either already at the onset of the disease
or as a late consequence. Most complications are due to
autoimmune mechanisms and are in some cases indiscernible
from graft versus host disease, a mechanism by which
the body might clear itself from the excess of EBV-converted
proliferating B-cells.

If the T-cell response is insufficient, e.g. due to
circumstances like treatment with high doses of Cyclosporin A
in combination with corticosteroids or due to
AIDS or a certain genetic predisposition as described by
Purtilo (Duncan's syndrome, X-chromosome-linked lymphoproliferative
disease (XLP); D.T. Purtilo, K. Sakamoto,
V. Barnabei, J. Seeley, T. Bechtold, G. Rogers, J. Yetz,
S. Harada and the XLP-collaborators: "Epstein-Barr virus-induced
diseases in boys with the X-linked lymphoproliferative
syndrome (XLP). Update on studies of the
registry.", Am. J. Med. 73, p. 49 (1982)), infected
B-cells may have a chance to escape from host control and
grow without limitation as they would do when being
cultivated in vitro. The consequences have been described
as BL-like disease in cases of AIDS patients (J.L. Ziegler,
R.C. Miner, E. Rosenbaum, E.T. Lennette, E. Shillitoe,
C. Casavant, W.L. Drew, L. Mintz, J. Gershow, J. Greenspan,
J. Beckstead, K. Yamamoto, "Outbreak of Burkitt's-like
lymphoma in homosexual men.", Lancet 2, p. 631 (1982))
or as a polyclonal lymphoproliferative disease for XLP
patients (D.T. Purtilo et al., supra) or kidney transplant
recipients (D.W. Hanto, G. Frizzera, D.T. Purtilo, K.
Sakamoto, J.L. Sullivan, A.K. Saemundsen, G. Klein, R.L.
Simmons, J.S. Najarian, "Clinical spectrum of lymphoproliferative
disorders in renal transplant recipients and
evidence for the role of Epstein-Barr virus.", Cancer
Res. 41, p. 4253 (1981)).

The positive and fast identification of infectious mononucleosis
or acute EBV infection is especially important
in cases where a differential diagnosis to leukemia or,
in the case of transplant recipients, to graft rejection crisis
is necessary. In these cases, a false diagnosis may lead
to incorrect therapy, which may have serious, even life-threatening
effects.

Prevention of primary disease caused by EBV

Infectious mononucleosis seems to be unknown in areas like
the Philippines or Malaysia (D.S.K. Tan, "Absence of
infectious mononucleosis among Asians in Malaya.",
Med. J. Malaya 21, p. 358 (1967)) where infection by EBV
occurs very early in life. Almost the whole population
has antibodies at the age of 2-10 years at the latest.
Clinical symptoms seem to be a consequence of juvenile
or adult infection. It can be assumed that a vaccine-primed
organism will be infected without significant
clinical symptoms and that the consequences, often fatal
in the risk groups listed above, could be eliminated by
a vaccine.

Burkitt's lymphoma and EBV

The development of Burkitt's lymphoma is linked to
chromosomal rearrangements. Not all cases contain EBV
genomes in the tumor cells. However, at least in areas
with high incidence, 97 % of these neoplasias are
EBV-related and a control of EBV infection is likely
to reduce the risk of developing Burkitt's lymphoma.

Nasopharyngeal carcinoma as a possible "secondary
disease" related to EBV

The other disease where EBV shows a 100 % association
is nasopharyngeal carcinoma (NPC) ("The Biology of
Nasopharyngeal Carcinoma", UICC technical report series,
vol. 71, edited by M.J. Simons and K. Shanmugaratnam,
International Union Against Cancer, Geneva, p. 1 (1982)).
NPC most frequently starts at the fossa of Rosenmueller
(Recessus pharyngeus) at the postnasal space. Frequently
patients are hospitalized only after the first typical
metastases have developed in the cervical lymph nodes.



In some areas of Southern China and amongst Chinese in
Singapore and Malaysia, NPC is the most frequent neoplasia
of man with an incidence of up to 40 per 100,000 per year.
In other parts of the world, like Borneo or Tunisia, the
incidence is also high. In most other areas, the incidence
is around 0.2 per 100,000 per year, which represents about
4 % of ear, nose and throat (ENT) tumors. The age distribution
shows a clear single peak around the age of 40
to 50 in almost all high-risk areas. In Borneo and to some
extent in Tunisia, a remarkable second peak has, however,
been observed at an early age ranging from 5 to 15 years
(M.J. Simons et al., supra).

Environmental factors including traditional Chinese
medicine may be responsible for the increased risk of
nasopharyngeal carcinoma in certain, predominantly
Chinese, populations of Southern Asia (H. Wolf, "Biology
of Epstein-Barr virus", in: "Immune deficiency and cancer:
Epstein-Barr virus and lymphoproliferative malignancies",
ed. D. Purtilo, Plenum Press, p. 233 (1984)).



Control of EBV-related neoplasia

There are three possible basic strategies to control
neoplasia:

1. early detection followed by therapy,

2. delay of onset of disease, ideally beyond the average
lifespan, and

3. prevention.

These goals may be achieved also in multifactorial 
diseases such as many neoplasias. Incidence of disease 
may be reduced by eliminating one or more of the essential 
factors which are not necessarily sufficient by them- 
selves to cause the disease, or by reducing factors which 
promote the manifestation of neoplastic conditions. 
The use of the specific virus-related antigens of this 
invention, or antibodies or genetic materials as tools 
for early diagnosis of virus-related tumors, might 
facilitate the elimination of essential factors. 



Selection of EBV-related gene products for diagnosis of
EBV-related NPC

A. Primary infection with EBV: Development of antibodies
against VCA (viral capsid antigen), EA (early antigen)
and EBNA (Epstein-Barr nuclear antigen)

EBV infects B-lymphocytes during acute or primary infection
(mononucleosis). Due to the lack of immune response,
a number of cells enter into the lytic cycle
and produce a full set of viral antigens which are shed
into the blood stream during cytolysis. Against these
antigens, specific antibodies will be synthesized by
the host's immune system (Table A).

Probably not all B-lymphocytes are capable of supporting
a fully lytic infection due to a cellular factor which
prevents expression of EBV. These cells latently
carry EBV genomes for the rest of the host's life.

Table A

Disease              VCA: IgG   IgM   IgA     EA     EBNA     MA 1)

Only the legible entries of each row are given, in source order;
their column positions are not recoverable.

Normal adults        +, +, +
Acute (early)        +
Chronic infection    +, +
Reactivation         +, +
XLP 2)               (+), 0
NPC                  ++, +, +(D), +
BL                   ++, +(R)

1) Determined by immunoprecipitation of SP 2^0/200
   (MA: membrane antigen)
2) XLP as an example of immunologically deprived hosts

B. Convalescence: Disappearance of antibodies against
EA and maintenance of antibodies against VCA and EBNA

As the immune defense mechanisms of the body remove the
lytically infected cells from the circulation, the antibody
levels will start to fall during the convalescent
phase. After a certain period, anti-EA antibodies disappear.
However, as mentioned above, EBV is produced in
the parotid gland. The viral particles and intracellular
virus-associated antigens including EA will be shed into
the saliva and reach the oropharynx. Here the viral
particles bind to the B-lymphocytes and are presented
to the body as antigens; thus the antibody titer against VCA
is maintained. Since EA cannot bind to the lymphocytes,
it will be degraded by proteases and therefore will not
be available to the immune system as an antibody-inducing
antigen.



The circulating lymphocytes that are latently infected
by EBV contain EBNA. At the end of their life cycle these
cells disintegrate and release EBNA into the blood stream.
Therefore antibodies to this antigen will persist.

Thus, due to the EBV production in the parotid gland and
to the release of EBNA from latently infected B-cells,
sera of convalescents will have low anti-VCA and anti-EBNA
IgG antibody levels (see Table A, supra).
In addition, EA released from rare B-lymphocytes which may enter a
lytic cycle may be an inferior antigen and may not give rise to antibody
levels detectable with the test systems used.



In combination with the known sequence of appearance of
antibody classes, specifically the early presence of IgM
antibodies followed by IgG antibodies, the various antigen
classes of primary disease caused by EBV can be
utilized for improved diagnostic procedures. However,
available test systems which are mainly based on cellular
antigens or cell-derived antigens have serious limitations.
This concerns the sensitivity, especially for detection of
IgM antibodies, and also unspecific reactions.

C. EBV-related antibodies in individuals suffering from NPC

The first suggestive evidence that Epstein-Barr virus
might be causally related to nasopharyngeal carcinoma
and African Burkitt's lymphoma was derived from serological
data (for review see M.A. Epstein, B.G. Achong,
"The Epstein-Barr Virus", Springer Verlag Berlin, Heidelberg,
New York (1979)).

Using mainly indirect immunofluorescence on cells producing
virus or at least early viral antigens, significantly
higher antibody titers to these antigens
were found in patients' sera. These first tests, which
detected unspecified immunoglobulin classes against
a group of proteins named Early Antigen (EA) and another
group of proteins named Virus Capsid Antigens
(VCA), were helpful for the establishment of a relationship
between EBV and these diseases. These tests,
however, are of limited value for definite diagnosis
of the malignancies from a single serum, and cannot
be used for monitoring therapy.

The introduction of antigen and antibody class specific
tests, specifically the determination of peripheral IgA
antibodies for the two antigen families EA and VCA, and
also the first attempts to subdivide at least the EA
family (EA, D or R; G. Henle, W. Henle and G. Klein,
"Demonstration of 2 distinct components in the early antigen
complex of Epstein-Barr Virus infected cells", Int.
J. Cancer 8, p. 272 (1971)) achieved remarkable improvements
of the diagnostic and prognostic value of the tests.

In the areas of high risk for NPC, 1 % of the adult population
has IgA antibodies for EBV capsid antigen (VCA).
Three percent of this group has NPC upon clinical examination
and, with the exception of terminal cases, there were no
anti-VCA IgA negative cases detected. Out of the IgA
anti-VCA positives, about 1 % per year developed NPC in
a 3 year follow-up. A test of this quality, if available
as a highly specific, automatically readable ELISA test, would
provide an excellent "first step" screening for a population
of extreme risk.
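
The screening figures quoted above can be turned into expected case counts. The following is a minimal sketch, not part of the described method: the function name, the screened population size and the linear treatment of the 3 year follow-up are assumptions made for illustration only; the three rates are the ones given in the text.

```python
# Illustrative arithmetic only. The rates come from the text; the
# population size and the function itself are hypothetical.

def screening_yield(population, iga_vca_rate=0.01, npc_at_exam=0.03,
                    annual_conversion=0.01, follow_up_years=3):
    """Expected counts for an IgA anti-VCA first-step screen."""
    positives = population * iga_vca_rate        # 1 % of adults are IgA/VCA positive
    prevalent = positives * npc_at_exam          # 3 % of positives have NPC at examination
    # Assumption: about 1 % of positives per year, added up linearly
    # over the follow-up period and applied to all positives.
    incident = positives * annual_conversion * follow_up_years
    return {
        "iga_vca_positive": positives,
        "npc_at_examination": prevalent,
        "npc_during_follow_up": incident,
    }


if __name__ == "__main__":
    for name, value in screening_yield(100_000).items():
        print(f"{name}: {value:.0f}")
```

Under these assumptions, screening 100,000 adults in a high-risk area would single out about 1,000 individuals for clinical examination, of whom about 30 would have NPC at examination and about 30 more would develop it during follow-up.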

Detection of EB virus IgA/VCA antibody is helpful for
diagnosis of NPC (see table on page 14), and of special value for the
detection of early stages. For example, in Wuzhou City (China;
high risk area for NPC), the frequency of NPC detected
by serological mass survey revealed a much higher percentage
of patients in stages I (42 %) and II (48 %) than
otherwise detected in outpatient clinics (1.7 % stage I
and 30 % stage II). The chance of survival is clearly
related to the stage at which therapy is begun. The survival
rates for stage I are (according to Shanghai Tumor
Hospital) 93 %, for stage II 75 %, and are very low for
more advanced stages. Therefore it is possible to reduce
the mortality rate of NPC through early detection and
early treatment.
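
The effect of the shift towards early stages can be illustrated with a short calculation. This is a sketch under a stated assumption: survival in stages beyond II, which the text only describes as "very low", is counted as zero, so the results are lower bounds. The function and variable names are hypothetical.

```python
# Illustrative lower-bound calculation from the stage shares and
# survival rates quoted in the text. Stages beyond II are assumed
# to contribute nothing (the text only says "very low").

SURVIVAL_RATE = {"I": 0.93, "II": 0.75}

MASS_SURVEY = {"I": 0.42, "II": 0.48}          # serological mass survey, Wuzhou City
OUTPATIENT_CLINIC = {"I": 0.017, "II": 0.30}   # detection in outpatient clinics


def survival_lower_bound(stage_share, survival_rate=SURVIVAL_RATE):
    """Share of all detected patients expected to survive, counting only stages I and II."""
    return sum(share * survival_rate[stage] for stage, share in stage_share.items())


if __name__ == "__main__":
    print(f"mass survey:       {survival_lower_bound(MASS_SURVEY):.3f}")
    print(f"outpatient clinic: {survival_lower_bound(OUTPATIENT_CLINIC):.3f}")
```

With these inputs the lower bound is about 0.75 for patients found by mass survey against about 0.24 for patients found in outpatient clinics, which is the quantitative core of the argument for early detection.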

IgA antibodies to the early antigen complex of EBV can be
detected in 40 % to 70 % of NPC patients, depending on the
method used. These antibodies are virtually absent in the
non-tumor-bearing population. Such a test of the tumor-bearing
individuals should be of great importance for the decision
to start therapy, and its value would be even higher if the
sensitivity could be enhanced to allow detection of disease
in closer to 100 % of the tumor patients.

The detection rate of NPC among IgA/VCA antibody-positive
individuals is 1.9 % and that of IgA/EA antibody-positive
individuals is 30-40 %. These data indicate that the IgA/EA antibody
test is more specific for the detection of NPC, but not as
sensitive as the IgA/VCA antibody test.

A number of laboratories have used the continuous determination
of IgA antibodies to EA and VCA to monitor the
success of therapy and for early detection of relapse,
with very good results.

Membrane protein gp 250/350 and its use 



Four proteins of the viral envelope constituting the so-called
membrane antigen complex (MA) have been described
(L.F. Qualtiere, G.R. Pearson, et al., supra; J. North,
A.J. Morgan, M.A. Epstein, "Observations on the EB virus
envelope and virus-determined membrane antigen (MA) polypeptides",
Int. J. Cancer 26, p. 231 (1980)). Two of these
proteins, i.e. gp 250 and gp 350, are antigenically closely related
(D.A. Thorley-Lawson and K. Geilinger, "Monoclonal
antibodies against the major glycoprotein (gp
250/350) of Epstein-Barr virus neutralize infectivity",
Proc. Natl. Acad. Sci. USA 77, p. 5307 (1980)). The
molecular weight of one component ranges from 200,000
to 250,000 D depending on the cell line from which the virus
is derived, and the second antigenically related
glycoprotein has a molecular weight of 300,000-350,000 D
but is absent in some cell lines. Since these glycoproteins
are all related in antigenicity, protein and
encoding DNA sequence, they are usually referred to
as gp 220/350 or gp 250/350 or simply as gp 250 or gp 350,
meaning the whole family of related glycoproteins.

Glycoprotein 250/350 is able to bind to the EBV receptor
of human and some primate B-lymphocytes and thus to
initiate the infection of these cells (A. Wells, N. Koide,
G. Klein, "Two large virion envelope glycoproteins mediate
Epstein-Barr virus binding to receptor-positive cells",
J. Virol. 41, p. 286 (1982)). Antibodies against these
proteins neutralize the infectivity of the virus, which
could be demonstrated for human as well as for rabbit
antisera and mouse monoclonal antibodies (D.A. Thorley-Lawson
et al., supra). By the use of monoclonal antibodies
it has been shown that blocking of only one antigenic
determinant present both in gp 350 and gp 250 was
sufficient for virus neutralization. Adsorption of human
sera to immobilized gp 350 and gp 250 removed the
neutralizing antibodies (D.A. Thorley-Lawson et al.,
supra). Thus, there is convincing evidence that

a) gp 350 and gp 250 induce the production of neutralizing
antibodies, and that

b) antibodies against gp 350 and gp 250 have neutralizing
capacity.

Therefore, this protein as well as its related viral gene
product, gp 350 (with a molecular weight of 350,000), are
candidates for a possible EBV vaccine (A.J. Morgan, M.A.
Epstein, J.R. North, "Comparative immunogenicity studies on
Epstein-Barr virus membrane antigen (MA) gp 340 with
novel adjuvants in mice, rabbits and cotton-top tamarins",
J. Med. Virol. 13, p. 281 (1984)). These glycoproteins
are expressed on induced EBV producer cell lines and
can be easily demonstrated after radioiodination of
cell surface proteins (L.F. Qualtiere, G.R. Pearson,
"Epstein-Barr virus-induced membrane antigens: immunochemical
characterization of Triton X-100 solubilized
viral membrane antigens from EBV superinfected Raji cells",
Int. J. Cancer 23, p. 808 (1979)).

Application of gp 250/350 for the diagnosis of EBV-related
diseases

IgG antibodies are absent during the acute phase of primary
EBV infection, but present for life after convalescence.
IgM antibodies are present in the early stage of the
disease and absent during convalescence.

IgA antibodies against EBV antigens are present almost
exclusively in NPC patients and can be detected in sera
of at least 58 % of these patients even with tests that
are not very sensitive (Zeng Yi and Hans Wolf, manuscript in
preparation, and example 16, infra).

Comparison of Positive Rate of IgG and IgA Antibodies
to VCA and MA from NPC Patients and Normal Individuals

                     Cases   MA/IgG        MA/IgA        VCA/IgA       EA/IgA
                             No.    %      No.    %      No.    %      No.    %

NPC patients         48      48     100    28     58.3*  48     100    31     64.6

Normal individuals   48      47     97.9   0

* MA/IgG and MA/IgA detected by immunofluorescence test;
VCA/IgA and EA/IgA detected by immunoenzymatic test
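
The percentages in the table follow directly from the counts of positive sera among the 48 cases per group. As a minimal check (the helper name is hypothetical and only the counts legible in the table are used):

```python
# Recompute the positive rates from the counts given in the table.
# positive_rate is an illustrative helper, not part of the described tests.

def positive_rate(positives, cases):
    """Percentage of positive sera, rounded to one decimal place."""
    return round(100.0 * positives / cases, 1)


if __name__ == "__main__":
    cases = 48
    counts = {
        "NPC MA/IgG": 48,
        "NPC MA/IgA": 28,
        "NPC VCA/IgA": 48,
        "NPC EA/IgA": 31,
        "Normal MA/IgG": 47,
        "Normal MA/IgA": 0,
    }
    for label, positives in counts.items():
        print(f"{label}: {positives}/{cases} = {positive_rate(positives, cases)} %")
```

The recomputed values (100, 58.3, 100, 64.6 and 97.9 %) agree with the tabulated ones.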



The whole gp 250 molecule or parts of its backbone polypeptide
chain can be utilized as reagents, preferably in
class-specific antibody detection tests such as passive
hemagglutination, counter gel electrophoresis, radioimmunoassays
or enzyme-linked immunosorbent assays.

Highly specific test antigens allow better signals and
detect otherwise unrevealed low antibody levels of
clinical significance. The use of singular antigenic
sites of the gp 250 instead of the entire gene product
may, in some cases, permit a more precise diagnosis
of the disease.

Application of gp 250/350 for prophylaxis and treatment
of EBV-related diseases

A. Since infection by EBV early in life only causes subclinical
seroconversion, it may be anticipated that
the presence of maternal antibodies or antibodies
induced by a vaccine will influence the clinical
manifestation of a primary EBV infection. It is
expected that the vaccination of children or young
adults, preferably before the peak of risk of catching
an EBV infection, effectively reduces the clinical
manifestation of infectious mononucleosis in the
population.

B. In all areas with high incidence rates for NPC or BL,
the population shows almost 100 % seroconversion to
EBV within the first one to two years of life.
Vaccination will have to take place soon after birth.
If this vaccination is regularly repeated, it will
in all probability prevent EBV infection, delay it
or reduce the biological effects of early primary
infection. Each of these consequences is expected
either to prevent the subsequent development of
neoplasia, to delay its onset considerably or to decrease
the relative risk.

C. In NPC, occasional production of viral antigens at
the site of the tumor will stimulate primarily IgA-secreting
B-lymphocytes. IgA antibodies are capable
of blocking antibody-mediated cytotoxicity. IgA antibodies
to viral membrane antigens, such as gp 250,
are present in NPC and BL patients and may not only
be indicators of the disease, but may even contribute
to the failure of the immune system to eliminate the
tumor cells by their masking potential. Large doses
of the purified antigen given to tumor patients may
bind IgA and initiate the formation of an excess of
IgG antibodies directed to the same antigen. These
specific IgG antibodies may then compete with
remaining IgA antibodies and allow the elimination
of tumor cells by antibody-dependent mechanisms.

D. Appropriate administration of gp 250 or related products
might also enhance the cellular immune mechanisms
and thus restrict the growth of tumors.

Production of EBV specific antigens according to the 
present invention 



1. As a consequence of all findings, it is one of the objects of this
invention to improve the sensitivity of tests for detection of antibody
classes and antigen-specific antibodies and to develop a system
which allows mass testing and better standardization.

2. EBV cannot be efficiently produced in a lytic cell 
cycle since efficiently infectable cells are not 
known at present and because all of the cells used 
as source for the preparation of EBV or related anti- 
gens are immortalized cells or even tumor derived 
cells. In most cell lines retroviruses have been 
demonstrated. The products isolated from such cultures 
therefore are not only very expensive but their use 
is also a potential safety risk. 

3. The application of recombinant DNA technology has
made possible the production of useful polypeptides
by appropriate host cells transformed with recombinant
DNA molecules and grown in appropriate culture systems.

4. According to the present invention, recombinant DNA
methods are used to express the genetic information
of the genes or at least of parts of the genes encoding the EBV proteins
p138, p150 and gp 250/350 in appropriate host cells, such as
bacteria (e.g. the genera Escherichia, Salmonella, Pseudomonas
or Bacillus), yeasts (e.g. the genera Candida or
Saccharomyces) and mammalian cells (e.g. Vero cells,
CHO cells or lymphoblastoid cell lines).

5. Furthermore, the genomic regions encoding the EBV proteins
p150, p143, p138, p110, p105, p90, p80 and p54 were identified and their
relevance for diagnostic purposes has been established. Therefore, the
key information for the production of these proteins or antigenic determinants
thereof in a manner as demonstrated for the proteins p138, p150
and gp 250/350 is also disclosed in the present invention.



Recombinant DNA technology 

A. Expression control systems 

Prokaryotes most frequently are represented by various 
strains of E. coli. However, other microbial strains 
may also be used, such as bacilli, for example 
Bacillus subtilis, various species of Pseudomonas, 
or other bacterial strains. In such prokaryotic 
systems, plasmid vectors which contain replication 
sites and control sequences derived from a species 
compatible with the host are used. For example, E. coli
is typically transformed using derivatives of pBR322, 
a plasmid derived from an E. coli species by Bolivar, 
et al., Gene 2, p. 95 (1977). pBR322 contains genes for 
ampicillin and tetracycline resistance, and these markers
can be either retained or destroyed in constructing the
desired vector. Commonly used prokaryotic control
sequences, which are defined herein to include promoters
for transcription initiation, optionally with an
operator, along with ribosome binding site sequences,
include such commonly used promoters as the beta-lactamase
(penicillinase) and lactose (lac) promoter
systems (Chang et al., Nature 198, p. 1056 (1977))
and the tryptophan (trp) promoter system (Goeddel et
al., Nucleic Acids Res. 8, p. 4057 (1980)). The lambda-derived
promoter and N-gene ribosome binding site
(Shimatake et al., Nature 292, p. 128 (1981)), which
have been made useful as a portable control cassette,
are further examples. However, any available promoter
system compatible with prokaryotes can be used.



In addition to bacteria, eukaryotic microbes, such
as yeast, may also be used as hosts. Laboratory strains
of Saccharomyces cerevisiae, Baker's yeast, are most
used, although a number of other strains are commonly
available. While vectors employing the 2 micron origin
of replication are illustrated (J.R. Broach, Meth.
Enz. 101, p. 307 (1983)), other plasmid vectors suitable
for yeast expression are known (see, for example,
Stinchcomb et al., Nature 282, p. 39 (1979); Tschumper
et al., Gene 10, p. 157 (1980) and L. Clarke et al.,
Meth. Enz. 101, p. 300 (1983)). Control sequences for
yeast vectors include promoters for the synthesis of
glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 7,
p. 149 (1963); Holland et al., Biochemistry 17, p. 4900
(1978)). Additional promoters known in the art include
the promoter for 3-phosphoglycerate kinase (Hitzeman
et al., J. Biol. Chem. 255, p. 2073 (1980)), and those
for other glycolytic enzymes, such as glyceraldehyde-3-phosphate
dehydrogenase, hexokinase, pyruvate decarboxylase,
phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase,
triosephosphate isomerase, phosphoglucose isomerase,
and glucokinase. Other promoters, which have the
additional advantage of transcription controlled by
growth conditions, are the promoter regions for alcohol
dehydrogenase 2, isocytochrome C, acid phosphatase,
degradative enzymes associated with nitrogen metabolism,
and enzymes responsible for maltose and galactose
utilization (Holland, supra).



Evidence suggests that terminator sequences are desirable
at the 3' end of the coding sequences. Such
terminators are found in the 3' untranslated region
following the coding sequences in yeast-derived genes.
Many of the vectors illustrated contain control sequences
derived from the enolase-I gene containing plasmid
peno46 (M.J. Holland et al., J. Biol. Chem. 256, p. 1385
(1981)) or the LEU 2 gene obtained from YEp13 (J. Broach
et al., Gene 8, p. 121 (1979)); however, any vector
containing a yeast-compatible promoter, origin of
replication and other control sequences is suitable.

It is also, of course, possible to express genes encoding
polypeptides in eukaryotic host cell cultures derived
from multicellular organisms. See, for example, Cruz and
Patterson, editors, "Tissue Cultures", Academic Press
(1973). Useful host cell lines include VERO and HeLa
cells, and Chinese hamster ovary (CHO) cells. Expression
vectors for such cells ordinarily include promoters and
control sequences compatible with mammalian cells such
as, for example, the commonly used early and late
promoters from Simian Virus 40 (SV 40) (Fiers et al.,
Nature 273, p. 113 (1978)), or other viral promoters such
as those derived from polyoma, Adenovirus 2, bovine
papilloma virus, or avian sarcoma viruses. General
aspects of mammalian cell host system transformations
have been described by Axel in U.S. Patent No. 4,399,216
issued August 16, 1983. It now appears also that
"enhancer" regions are important in optimizing
expression; these are, generally, sequences found
frequently upstream of the promoter region. Origins of
replication may be obtained, if needed, from viral
sources. However, gene integration into the chromosome
is a common mechanism for DNA replication in eukaryotes,
and hence independently replicating vectors are not
required. Plant cells are also now available as hosts,
and control sequences compatible with plant cells such
as the nopaline synthase promoter and polyadenylation
signal sequences (A. Depicker et al., J. Mol. Appl.
Gen. 1, p. 561 (1982)) are available.

Transformation of suitable hosts 

Depending on the host cell used, transformation is done
using standard techniques appropriate to such cells.
The calcium treatment employing calcium chloride, as
described by S.N. Cohen, Proc. Natl. Acad. Sci. (USA) 69,
p. 2110 (1972) is used for prokaryotes or other cells
which contain substantial cell wall barriers. Infection
with Agrobacterium tumefaciens (C.H. Shaw et al.,
Gene 23, p. 315 (1983)) is used for certain plant cells.
For mammalian cells without such cell walls, the
calcium phosphate precipitation method of Graham
and van der Eb (Virology 52, p. 546 (1978)) is preferred.
Transformations into yeast are carried out according to
the method of P. van Solingen et al. (J. Bact. 130,
p. 946 (1977)) and C.L. Hsiao et al. (Proc. Natl. Acad.
Sci. (USA) 76, p. 3829 (1979)). Alternatively, the
procedure of Klebe et al. (Gene 25, p. 333 (1983)) can
be used.

C. Construction of recombinant cloning and expression
vectors

Construction of suitable vectors containing the desired
coding and control sequences employs standard ligation
and restriction techniques which are well understood
in the art. Isolated plasmids, DNA sequences, or
synthesized oligodeoxyribonucleotides are cleaved,
tailored and religated in the form desired.

Site specific DNA cleavage is performed by treating
with the suitable restriction enzyme (or enzymes)
under conditions which are generally understood in
the art, and the particulars of which are specified
by the manufacturer of these commercially available
restriction enzymes. See, e.g., New England Biolabs,
Product Catalog.
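
The effect of site specific cleavage on a linear DNA can be
sketched in a few lines of Python. This is only an illustration:
the recognition sequences and cut positions of EcoRI, BamHI and
PstI are the well-known ones, but the demonstration sequence and
the function names are invented here and are not taken from the
constructs of this specification.

```python
import re

# Recognition site and cut offset on the top strand (G^AATTC -> offset 1).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "PstI": ("CTGCAG", 5),
}

def cut_positions(seq, enzyme):
    """Return the top-strand cut positions of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites would also be found.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq.upper())]

def fragment_lengths(seq, enzymes):
    """Return the fragment lengths after a complete digest with all enzymes."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical 34 bp sequence carrying one site for each enzyme.
demo = "AAAAGAATTCTTTTTTGGATCCAAAACTGCAGTT"
print(fragment_lengths(demo, ["EcoRI", "BamHI", "PstI"]))
```

Because all three sites are palindromic, searching the top strand
alone is sufficient in this sketch.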

If desired, size separation of the cleaved fragments 
may be performed by polyacrylamide gel or agarose gel 
electrophoresis using standard techniques. A general 
description of size separations is found in "Methods 
in Enzymology" 65, p. 499-560 (1980). 

Restriction cleaved fragments may be blunt ended
by treating with the large fragment of E. coli DNA
polymerase I (Klenow) in the presence of the four
deoxynucleotide triphosphates (dNTPs). The Klenow
fragment fills in at 5' sticky ends but chews back
protruding 3' single strands, even though the four
dNTPs are present. If desired, selective repair
can be performed by supplying only a selected one
or more dNTPs within the limitations dictated by the
nature of the sticky ends. Treatment under appropriate
conditions with S1 nuclease results in hydrolysis of
any single-stranded portion.
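
The two behaviours of the Klenow fragment described above, and
the selective repair with a subset of dNTPs, can be modelled as
follows. This is a simplified sketch under stated assumptions:
one fragment end is described by its base-paired part and its
single-stranded overhang, the function name klenow_treat is
hypothetical, and reaction kinetics are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def klenow_treat(paired_top, overhang, kind, dntps="ACGT"):
    """Return the top strand of one fragment end after Klenow treatment.

    paired_top: base-paired part of the end, top strand written 5'->3'
    overhang:   single-stranded extension, written 5'->3' on its own strand
    kind:       "5'" for a 5' protruding end, "3'" for a 3' protruding end
    dntps:      the dNTPs supplied to the reaction
    """
    if kind == "3'":
        # The protruding 3' strand is removed by the 3'->5' exonuclease.
        return paired_top
    if kind == "5'":
        # The recessed 3' end is extended along the 5' overhang and stops
        # at the first position whose dNTP was not supplied.
        added = ""
        for base in reverse_complement(overhang):
            if base not in dntps:
                break
            added += base
        return paired_top + added
    raise ValueError("kind must be 5' or 3'")

print(klenow_treat("CCG", "AATT", "5'"))             # EcoRI-type end, filled in
print(klenow_treat("GGG", "GATC", "5'", dntps="G"))  # selective repair, dGTP only
print(klenow_treat("CCC", "TGCA", "3'"))             # PstI-type end, chewed back
```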

Synthetic oligonucleotides may be prepared by the
triester method of Matteucci et al. (J. Am. Chem. Soc. 103,
p. 3185 (1981)) or the diethylphosphoramidite method
of Caruthers, described in U.S. Patent No. 4,415,732,
issued November 15, 1983.

Ligations are performed under standard conditions and 
temperatures (as described below) using T4 DNA ligase. 

In vector constructions employing "vector fragments", 
the vector fragment is commonly treated with bacterial 
alkaline phosphatase (BAP) in order to remove the 5' 
phosphate and prevent religation of the vector. BAP 
digestions are carried out under standard conditions 
(as described below).

D. Selection of transformants 

In the constructions correct ligations for plasmid
construction are confirmed by transforming E. coli
or other suitable hosts with the ligation mixture.
Successful transformants are selected by ampicillin,
tetracycline or other antibiotic resistance or
using other markers depending on the mode of plasmid
construction, as is understood in the art.

Brief summary of the invention 

The present invention relates to the production of EBV
specific antigens by recombinant DNA technology and their
use in diagnosis, prophylaxis and therapy of EBV-related
diseases. Therefore, it is an object of this invention
to identify novel Epstein-Barr viral antigens, such as p150,
p143, p138, p110, p105, p90, p80, p54 (G.J. Bayliss, H. Wolf,
infra), which are correlated with Epstein-Barr virus related
diseases like nasopharyngeal carcinoma (NPC), infectious
mononucleosis, and Burkitt's lymphoma (see legend to
Fig. 1 and Fig. 28) by immunological methods.

Another object of this invention is the localization and
identification of genomic regions of EBV, for example as
it has been cloned from B95-8 cells (American Type Culture
Collection, Rockville, Maryland, USA (ATCC) CRL1612)
(J. Skare, J.L. Strominger, "Cloning and mapping of BamHI
endonuclease fragments of the DNA from the transforming
B95-8 strain of Epstein-Barr Virus", Proc. Natl. Acad.
Sci. USA 77, p. 3860 (1980)) coding for said antigens of
diagnostic importance and of relevance for medical
purposes. This is achieved by using the hybrid selection
method.



A further object of the present invention is the
subcloning of a genomic region of EBV, for example from
existing libraries of EBV cloned from B95-8 cells, which
encodes at least a part of useful antigens for medical
purposes, such as p138 and p150. This is achieved by
joining a subgenomic fragment, e.g. an XhoI fragment,
derived from the EBV B95-8 subclone pBR322 BamA (J. Skare
et al., supra) to the plasmid pUC8 (J. Messing, infra)
(pUC635, see Fig. 4).

Another object of this invention is the production of
proteins by expression of the respective genetic
information in suitable host cells, such as bacteria
(e.g. of the genera Escherichia, Salmonella, Pseudomonas or
Bacillus), yeasts (e.g. of the genera Candida or
Saccharomyces), animal cells and human cells (e.g. Vero
cells; CHO cells; CHO dhfr- cells in combination with an
appropriate selection system, optionally a plasmid
carrying a functional dhfr gene as well as the genetic
information for the EBV gene under control of a suitable
regulation sequence; or lymphoblastoid cell lines). The
proteins produced by these host cells contain e.g. p138,
p150 or gp250/350-related antigenic determinants
(epitopes) and are, depending on the expression system,
synthesized either as a fusion protein or as a non-fusion
protein.

For the production of a fusion protein by bacteria
the expression of the genomic subfragments, for example
that encoding a part of p138 of EBV B95-8 and introduced into
the known plasmid pUC8, was induced e.g. by
isopropyl-β-D-thiogalactopyranoside (IPTG). The respective
expression products were identified by immunological methods.

Another fusion protein is provided by cleaving subclone
pUC635 with EcoRI and BglII and introducing this fragment
into the vector plasmid pUC9 (U. Rüther, infra). The
resulting recombinant plasmid is pUC924 (Fig. 6). The
expression product has a size of about 94 kd.

A further fusion protein is produced by expressing the
genetic information of said XhoI p138-encoding fragment
of pBR322 BamA in the plasmid pEA305 (E. Amann, J. Brosius,
M. Ptashne, "Vectors bearing a hybrid trp-lac-promoter
useful for regulated expression of cloned genes in E. coli",
Gene 25, p. 167 (1983)). After putting the p138-related
information into a proper reading frame relative to pEA305,
the clone pMF924 synthesizes a fusion protein that contains
a part of the λ-repressor protein cI (Fig. 7).

Still another fusion protein is provided by cloning a 3.0 kb
genomic XhoI fragment containing p138-related genetic
information 3' to a hybrid trp-lac promoter (as described
by E. Amann et al., supra). For this purpose the known plasmid
pKK240-11 was used. The resulting clone pKK378 synthesizes
a fusion protein that is composed of an aminoterminal
methionine residue followed by the amino acid sequence
encoded by the p138-related DNA sequences (Fig. 8).

Still another object of the present invention is to provide
fusion proteins, non-fusion proteins or oligopeptides
which contain only the antigenic determinant protein
subregions of viral proteins like p138. For this purpose
the determinants of the protein are located by a
computer-directed analysis, using a computer program
developed by us for the Digital Equipment VAX 11/750
computer. A similar program has been used for other
problems and another computer by G.H. Cohen, B. Dietzschold,
M. Ponce de Leon, D. Long, E. Golub, A. Varrichio,
L. Pereira, R.J. Eisenberg, "Localization and synthesis of
an antigenic determinant of Herpes simplex virus
glycoprotein D that stimulates the production of
neutralizing antibody", J. Virol. 49, p. 102 (1984). By
cloning the respective fragments into vectors like pUC8 or
pUR288 (U. Rüther et al., infra) plasmids such as pUR600
and pUR540 were obtained. The produced large and small
fusion proteins are investigated by gel electrophoresis
and immunoblotting experiments. The cloning experiments in
pUR288 were done for stabilizing the small p138-related
polypeptides.
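
The program used for the VAX 11/750 is not reproduced in this
specification. Purely as an illustration of the kind of
computer-directed analysis meant here, the following sketch scans
a protein sequence with a sliding window and reports the most
hydrophilic stretch. It substitutes the published Hopp-Woods
hydrophilicity scale for whatever scale the original program
used, it does not attempt a Chou-Fasman turn prediction, and the
example peptide is invented.

```python
# Hopp-Woods hydrophilicity values (higher = more hydrophilic).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

def hydrophilicity_profile(seq, window=6):
    """Average hydrophilicity of every window of the given length."""
    values = [HOPP_WOODS[aa] for aa in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def most_hydrophilic_window(seq, window=6):
    """Return (start index, peptide, score) of the best-scoring window."""
    profile = hydrophilicity_profile(seq, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best, seq[best:best + window], profile[best]

# Hypothetical peptide: a charged stretch flanked by leucines.
peptide = "LLLLLLKDEKRELLLLLL"
print(most_hydrophilic_window(peptide))
```

A candidate determinant found in this way would still have to be
confirmed experimentally, as is done here by expression and
immunoblotting.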

A further object of the present invention is the expression
of polyantigens composed of antigenic determinants of
several different EBV serotypes. For that purpose the
corresponding DNA fragments are linked and introduced
in a suitable vector. The expression products are fusion
and non-fusion EBV-specific polyantigens.

Further fusion proteins containing p150-related antigenic
determinants were obtained by cloning and expression of the
corresponding DNA sequences in pUR plasmids and pUC plasmids.
The obtained constructs were the recombinant plasmids
pUR290CXH580, pUR290DBX320, pUR292DBB180, pUR290DTT700,
pURDTT740, pUR290DTP680, and pUR288DPP320.

Another object of the present invention is the construction
of new expression vectors, such as pUC600 and pUC601,
which contain a part of the coding region of the viral
protein p138. If DNA sequences are introduced 3' to this
sequence into the vector, the expression products are
stabilized by the p138-specific amino acid sequence and
protected against protease degradation.

Still another object of this invention is the modification
of said expression vectors by introducing a DNA sequence
coding for three to fifteen Arginine residues and at
least one stop codon 3' of the cloning site of said
expression vectors and furthermore positioned in an
appropriate reading frame. The obtained vector is pUCARG601.
If DNA sequences coding for proteinaceous material are
inserted into this expression vector the expression
products will be fusion proteins carrying said Arginine
residues at their carboxy terminus, such as those
fusion proteins encoded by plasmids pUCARG1140 (see
Fig. 12a)) and pUCARG680.



Thus it is an object of this invention to provide a simple
method for isolating proteins useful for diagnosis,
prophylaxis and therapy, such as EBV p138-related
polypeptides or oligopeptides (antigens), from the host
cell lysate according to the method of H.M. Sassenfeld,
S.J. Brewer ("A polypeptide fusion designed for the
purification of recombinant proteins", Bio/Technology 2,
p. 76 (1984)). By introduction of said Arginine residues
the net charge of the expressed proteins becomes positive
and after lysis of the host cells the oligo-arginine
linked proteins are isolated by SP Sephadex C-25 column
chromatography. Due to the oligo-arginine group the EBV
specific proteins are eluted at a high NaCl concentration.
This eluate is then treated with carboxypeptidase B which
degrades carboxy-terminal lysine and arginine residues.
Finally another SP Sephadex C-25 chromatography is carried
out wherein the EBV-related proteins are eluted at low
salt concentrations (see Fig. 16). It is evident, however,
that this procedure may be used also for the purification
of proteins secreted into the medium.
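
The charge shift that this purification scheme relies on can be
illustrated with a rough calculation. The sketch below counts
only Lys/Arg as positive and Asp/Glu as negative, ignores
histidine, the chain termini and the exact pH, and uses an
invented antigen sequence; the function names are hypothetical.

```python
def net_charge(seq):
    """Crude net charge near neutral pH: (K + R) minus (D + E)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative

def carboxypeptidase_b(seq):
    """Remove carboxy-terminal Lys/Arg residues until another residue is reached."""
    return seq.rstrip("KR")

antigen = "MDEASGDELT"       # hypothetical acidic antigen fragment
tagged = antigen + "R" * 5   # same fragment with a five-arginine tail

print(net_charge(antigen))                      # without the tail
print(net_charge(tagged))                       # with the tail
print(carboxypeptidase_b(tagged) == antigen)    # tail removed again
```

In this simplified picture the tagged protein binds more strongly
to the cation exchanger and elutes at high salt, while the
protein freed of its arginine tail elutes at low salt in the
second chromatography.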

It is also evident that other established methods for
protein purification such as molecular sieving or affinity
chromatography on ion exchange columns or columns loaded
with specific antibodies to the expressed proteins can be
used as additional or alternate purification methods.

For the production of non-fusion proteins which
essentially contain amino acid sequences of the naturally
occurring proteins or parts thereof the recombinant
plasmids of the present invention may be modified. If an
oligonucleotide linker is inserted between the bacterial
protein encoding region and the EBV-related protein encoding
region of the expression vector, the amino acid sequence
corresponding to the oligonucleotide linker becomes part
of the expressed fusion protein. After isolation of this
fusion protein from the transformants expressing it, it is
cleaved either by amino acid sequence specific proteases
in the introduced amino acid linker or, if the amino acid
linker comprises peptide bonds sensitive to acid cleavage, by
treatment with acids, e.g. formic acid.

A further object of this invention is the cloning of a
genomic region of EBV coding for at least a part of
the specific viral antigen gp 250 and gp 350. This is
achieved by joining a subgenomic PstI-PstI fragment of the
EBV genome from the cell strain B95-8 (ATCC CRL 1612) (R. Baer et al.,
infra) contained in pBR322 BamL (J. Skare et al., supra) to the plasmid
pUC8 (J. Messing et al., infra). The resulting recombinant plasmid
is designated as pUCLP1.9 (see Fig. 19).

For the production of a fusion protein by bacteria, a
genomic subfragment coding for a part of gp 250 and gp 350
of EBV B95-8 was cloned into the vector pUR290 (U.
Rüther et al., infra) which carries a region of the lacZ
gene coding for the enzyme β-galactosidase (pURLP1.9,
see Fig. 20). The respective expression product was
purified and identified by immunological methods.

Still another object of the present invention is to provide
fusion proteins or non-fusion proteins which contain
only the antigenic determinant protein subregions
of gp 250 and gp 350. For this purpose the antigenic
determinants of the proteins were localized by a
computer-directed analysis using a computer program
developed by us for the Digital Equipment VAX 11/750
computer. The respective DNA fragments are then cloned in
a conventional expression vector such as pUR
(β-galactosidase) (U. Rüther et al., infra). Plasmids
obtained were e.g. pURLEP600 and pURLXP390 (see Fig. 27).
Furthermore, the N-terminal antigenic determinant of
gp250/350 was expressed as a fusion protein in a
pUC vector (pUCLEP600, see Fig. 27).

Another fusion protein is provided by cloning a DNA-frag- 
ment coding for the N-terminal antigenic determinant of 
gp 250 and gp 350 into the expression vector pUCARG601 
mentioned above. 

A further object of the present invention is the
expression of polyantigens containing several antigenic
determinants of gp 250 and gp 350 located by said computer
analysis. For this purpose the corresponding DNA
fragments are linked and introduced in a suitable vector.
The expression products are fusion and non-fusion
EBV-specific polyantigens.

A final object of the present invention is the utilization
of either said EBV-related proteins or subregions thereof or,
if suitable, EBV-related DNA fragments or clones, for the
production of diagnostic compositions (kits) useful in
clinical diagnosis or scientific research. These tests are
based on principles such as ELISA (enzyme-linked
immunosorbent assay), RIA (radioimmunoassay) or the
indirect hemagglutination assay. Furthermore, the
EBV-related proteins can be used, e.g. for monitoring
vaccination programs, analyzing epidemiological problems,
for the treatment of patients, and for the production
of vaccines for prophylaxis and therapy of EBV-related
diseases, such as mononucleosis, Burkitt's lymphoma and
nasopharyngeal carcinoma. Vaccines can be manufactured
according to conventional methods. Unit doses are filled in
vials optionally together with a conventional adjuvant such
as aluminium hydroxide. Alternatively the product may
be administered in the form of aggregates with liposomes.
Patients may be vaccinated with a dose sufficient
to stimulate antibody formation and revaccinated after
one month and after 6 months.

Finally the proteins are useful for prophylaxis and 
therapy of EBV-related diseases, because they are 
able to modulate the immune response in patients 
suffering from diseases such as NPC, chronic infectious 
mononucleosis or EBV-related Burkitt's lymphoma. 

Brief description of the drawings 

Figure 1: Autoradiography of an immunoprecipitation of
EBV-specific sera derived from patients suffering from
mononucleosis and NPC.

The immunoprecipitated 35S-labelled proteins were
separated by SDS-polyacrylamide gel electrophoresis
and an X-ray film was exposed to the gel.

The sources of the different sera used for precipitation
are given at the bottom of the respective regions of the
autoradiography. The control, designated "pool", contains
all of the immunoprecipitable EBV-specific proteins.

It can be taken from the autoradiography that at least
antibodies to p138, p105 and p80 are present in each of
the NPC sera and only in some of the other EBV infection
specific sera. In analogy, antibodies to p54 are significant
for fresh EBV infection (infectious mononucleosis) as
compared to convalescent state. Antibodies to p150, p143,
p110, p90 are also present in convalescent sera of healthy
individuals and can serve as markers for immunity or,
in connection with IgM specific tests, for fresh EBV
infection or, in connection with IgA, for a specific test
for EBV-related neoplasia (NPC and BL).

Figure 2: Mapping of mRNA's relative to the EBV B95-8
genome.

The BamHI restriction sites of the EBV B95-8 genome are
given at the bottom of the figure and the respective
restriction fragments are designated by upper and lower
case letters. The mRNA's of the proteins localized by
hybrid-selection to individual BamHI restriction fragments
are indicated by numbers and lines.

It can be taken from the figure that the gene of p138
was correlated to the BamA fragment.

Figure 3: DNA sequence of the leftward reading frame of
BamA encoding p138.

The sequence shown is the respective negative strand.

The p138 encoding region starts at nucleotide position 182
and ends at nucleotide position 3565. Restriction sites
used for cloning of fragments of this coding region are
indicated.

Figure 4: Restriction map of the plasmids pUC635 and
pUC6130.

The size of the vector pUC8 is 2.7 kb. The cloning site
3' of the lacUV5 β-galactosidase promoter and operator
(PO) contains an EcoRI(E), BamHI(B), SalI(S), PstI(P),
and HindIII(H) site. The β-lactamase gene is indicated
by AMP. The 3.0 kb and 3.3 kb XhoI fragments of the
p138 coding region are inserted into the SalI site of pUC8.
The insertion is indicated by an open bar. pUC635 contains
the 3.0 kb XhoI fragment in a correct reading frame relative
to the β-galactosidase gene, whereas pUC6130 contains the
3.3 kb XhoI fragment in the opposite orientation.

Figure 5: Expression of the proteins encoded by plasmids pUC635,
pUC924, pMF924, and pKK378.

Lane 1 of the immunostained Western blot shows the proteins
isolated from bacteria transformed with pUC8 and induced
with IPTG.

Lane 2: proteins of pUC924 transformed bacteria
Lane 3: proteins of pKK378 transformed bacteria
Lane 4: proteins of pMF924 transformed bacteria
Lane 5: proteins of pUC635 transformed bacteria
The size of the fusion protein was estimated to be
75 kD (lane 2), 110 kD (lane 3), 90 kD (lane 4)
and 135 kD (lane 5).

Figure 6: Restriction map of the plasmid pUC924.

The size of the vector pUC9 is 2.7 kb. The cloning site 3'
of the lacUV5 β-galactosidase promoter and operator (PO)
contains an EcoRI(E), BamHI(B), SalI(S), PstI(P), and
HindIII(H) site. The β-lactamase gene is indicated by AMP.
The 2.6 kb BglII/EcoRI fragment of pUC635 is inserted
between the BamHI and EcoRI sites. The abbreviation of
BglII is "Bg".
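
The insertion of a BglII end into a BamHI site (Fig. 6), like the
insertion of XhoI fragments into a SalI site (Fig. 4), works
because the two enzymes of each pair leave identical cohesive
ends. The following sketch checks this from the well-known
recognition sequences; the function names are hypothetical and
the model covers only enzymes that leave 5' overhangs.

```python
# Recognition site and top-strand cut offset (all leave 5' overhangs).
SITES = {
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "EcoRI": ("GAATTC", 1),
}

def overhang(enzyme):
    """Single-stranded 5' overhang left by the enzyme."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]

def compatible(enzyme_a, enzyme_b):
    """True if ends produced by the two enzymes can anneal and be ligated."""
    return overhang(enzyme_a) == overhang(enzyme_b)

def junction(left_enzyme, right_enzyme):
    """Sequence formed when the left half of one site joins the right half of another."""
    left_site, left_cut = SITES[left_enzyme]
    right_site, right_cut = SITES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]

print(compatible("BglII", "BamHI"), junction("BglII", "BamHI"))
print(compatible("XhoI", "SalI"), junction("XhoI", "SalI"))
```

The hybrid junctions are recognized by neither parental enzyme,
which is worth keeping in mind when such a fragment has to be
excised again.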

Figure 7: Restriction map of the plasmid pMF924.

The 2.6 kb BamHI/HindIII fragment of pUC924 was inserted
into the BamHI and HindIII restriction sites of pEA305
which are located 3' of the hybrid trp-lac promoter
(tac) and the aminoterminal coding region of cI (λ-repressor).

Figure 8: Restriction map of the plasmid pKK378.

The 3.3 kb BamHI/HindIII fragment of pUC6130 was inserted
into the HindIII site of the vector pKK240-11 using a
345 bp BamHI/HindIII fragment of pBR322 as a linker (which
is indicated by a heavy black line). Thus the p138
encoding fragment is located 3' of the hybrid trp-lac
promoter (tac) and an ATG start codon.

Figure 9: Secondary structures of p138.

Computer plot of Chou-Fasman calculation of the p138
secondary structure. Additionally, the hydrophobic
(closed circles) and hydrophilic (open circles) regions
are indicated.

Antigenic sites can be expected in hydrophilic regions
with a β-turn. This situation is given in the P600 region
and at the carboxy-terminus of the protein.

The regions subcloned into the vectors pUC8 and pUR288
are indicated.

Figure 10: Expression products of bacteria transformed
with the plasmid pUR carrying PstI fragments of p138.

A. A Coomassie brilliant blue stained SDS polyacrylamide
slab gel analysis of lysates of IPTG induced bacteria
carrying the various plasmids is shown. Fusion proteins
with molecular weights between 120 and 150 kd are
indicated with a closed circle. Track M: molecular weight
markers; tracks pUR400 to pUR540: lysates of bacteria
carrying plasmids containing the regions of p138 as
shown in Fig. 3.

B. An enzyme-linked immunoassay of proteins transferred
from a gel (similar to that shown in panel A) onto
nitrocellulose paper (Western blot) is shown.
In this assay a pool of high titered antiserum was
used and after washing, the bound immunoglobulins
were visualized by sequential reaction with peroxidase
coupled to antibodies against human IgG and
diaminobenzidine. Only fusion proteins from bacteria
containing pUR600 and pUR540 show specific reactions.
Plasmid pUC635 (as a positive control) contains
almost the whole of the p138 coding region; however, the
protein is unstable and is rapidly degraded. pUC8
is the negative control containing the vector plasmid
free from EBV derived sequences.

Figure 11: Expression product of bacteria transformed
with the pUC subclones carrying PstI fragments of p138.

An enzyme-linked immunoassay of proteins electrophoretically
transferred from a gel onto nitrocellulose paper (Western
blot) was carried out. In this assay a pool of high titered
antiserum was used and after washing, the bound
immunoglobulins were visualized by sequential reaction with
peroxidase coupled to antibodies against human IgG and
diaminobenzidine. The fusion protein from bacteria
containing pUCP600 was stably produced and shows a specific
antigenic reaction.

Figure 12: Construction scheme for plasmid pUCARG1140 encoding both
antigenic sites found by expression as β-gal fusion proteins

a) The 5'-PstI site in pUC600 was removed by digesting with
SstI (20 bp upstream) and HindIII followed by the ligation
with pUC12-SstI/HindIII. From this plasmid the insert was
removed with EcoRI and PstI and ligated into pUC8-EcoRI. An
oligonucleotide coding for five arginines and two stop
codons was inserted into the resulting plasmid pUC601 as
single-stranded DNA between the 3'-PstI site and HindIII
(pUCARG601). In a last step the 540 bp PstI fragment encoding
the second antigenic determinant from the C-terminus of
p138 was inserted by digestion with PstI and ligation.

The resulting plasmid contained both antigenic sites in
frame followed by five arginine residues. It was designated
as pUCARG1140.

b) Nucleotide sequence of the oligoarginine linker. The lower
strand was synthesized and inserted as a single-strand DNA
via bridge formation between the sticky ends of PstI and HindIII.

Figure 13: IPTG-induced expression of the plasmids pUC600,
pUC601, pUCARG601 and pUCARG1140 with pUC8 as a control.
The upper part shows a Coomassie-stained SDS-PAGE. The newly
detected proteins are marked by a black dot. The lower
part shows the corresponding Western blot obtained after
immunostaining with serum from NPC patients. In comparison
to pUCP600 the EBV-related protein encoded by pUC601 is
about 1.5 kD smaller due to the lack of 14 amino acids
(6 amino acids encoded by the pUC polylinker and 8 from the
PstI-SstI fragment). The size of the protein encoded by
pUCARG601 is further reduced by about 11 kD since the read
through into the lacZ region of pUC is inhibited by stop
codons present in the inserted oligonucleotide. In
pUCARG1140 the size increases to about 42 kD due to the
insertion of the 540 bp fragment. The protein is stable in
bacterial cells.
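
The size differences quoted in this legend can be checked with a
common rule of thumb of about 110 Da per amino acid residue. That
average is an assumption introduced here, not a figure from the
specification, and the helper names are hypothetical.

```python
AVG_RESIDUE_DA = 110.0  # assumed average mass of one amino acid residue

def kd_from_residues(n_residues):
    """Approximate protein mass in kD for a given number of residues."""
    return n_residues * AVG_RESIDUE_DA / 1000

def kd_from_bp(n_bp):
    """Approximate mass in kD encoded by an in-frame DNA fragment."""
    return kd_from_residues(n_bp // 3)

print(kd_from_residues(14))   # the 14 missing amino acids
print(kd_from_bp(540))        # the 540 bp PstI fragment
```

Under this assumption the 14 missing amino acids correspond to
roughly 1.5 kD, in line with the legend, and the 540 bp fragment
contributes about 180 residues or roughly 20 kD.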

Figure 14: Distribution and reactivity of the IgG and
IgA antibodies of individual NPC sera against the
two epitopes detected in p138.

Lysates of IPTG-induced E. coli cells carrying the indicated
plasmids were independently separated four times on 12%
SDS-PAGE and the proteins were transferred to
nitrocellulose by Western blotting. Lanes 1: pUR288 as
negative control; lanes 2: pUCARG1140 as a positive
control; lanes 3: pUR540; lanes 4: pUR600. Two
individual NPC sera (no. 352 and 354) were incubated with
the filters and the bound IgG and IgA antibodies were
visualized using peroxidase conjugated anti-human IgG and anti-human IgA
rabbit antibodies. The different locations of the
proteins in the Western blots, especially of pUCARG1140,
result from different electrophoresis times of the
SDS-PAGEs.

Whereas in NPC serum no. 352 the main reaction of the
IgG and IgA antibodies is directed against the P540 epitope
from the C-terminus of p138 (see Fig. 9), in serum no. 354
the main part of the anti-p138 antibodies recognizes the
P600 epitope (see Fig. 9). This indicates that both
antigenic sites are necessary for detecting anti-p138
antibodies in sera.

Figure 15: ELISA test using the protein encoded by plasmid pUCARG1140
as antigen.

Row 1 and 3: EBV-negative sera, row 2: NPC pool serum,
row 4-13: individual NPC sera. The dilutions tested are
indicated at the bottom; left lane: IgG, right lane: IgA.




Figure 16: Purification of proteins carrying oligo-arginine
groups at their carboxy-terminus.

A. Sequence of the oligonucleotide encoding five arginine
residues and two stop codons. A HindIII site at the
5'-end and a PstI site at the 3'-end were generated
for the insertion of the oligonucleotide into pUC8.

B. Purification scheme of insoluble expressed eukaryotic
proteins carrying said Arg-linker at their
carboxy-terminus.



Fig. 17:

DNA sequence of the leftward reading frame of the Bam L-fragment
encoding gp 250/350.

The coding region for the glycoprotein starts at genomic
position 92153 and ends at position 89433. The sequence
shown is the respective negative strand, beginning with
the BamHI site at position 92703. According to the sequence
numbering in this figure the gp 350 encoding region is
located between position 556 and 3276. A TATAA-box in
the region of basepair 520 is marked, and the
probable poly-adenylation site at position 3290 is marked
with +++. The splice donor and splice acceptor sites are
indicated by } ( for donor and ) ( for the acceptor site.

A hydrophobic region near the carboxy-terminus
of the coding region is marked with ***. Probably this
amino acid sequence serves as an anchor sequence for
fixing the protein to the membrane.
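
The relation between genomic positions and the numbering used in
this figure can be written down explicitly. The constant below is
inferred from the two coordinate pairs stated in the legend
(92153 and 556, 89433 and 3276) and is therefore an assumption,
as are the function names.

```python
# Inferred from the legend: figure position = 92709 - genomic position,
# because the figure shows the negative strand.
FIGURE_OFFSET = 92709

def figure_position(genomic_position):
    """Convert an EBV B95-8 genomic position to the numbering of Fig. 17."""
    return FIGURE_OFFSET - genomic_position

def genomic_position(figure_pos):
    """Convert a Fig. 17 position back to the genomic position."""
    return FIGURE_OFFSET - figure_pos

CODING_START, CODING_END = 92153, 89433
coding_length = CODING_START - CODING_END + 1

print(figure_position(CODING_START), figure_position(CODING_END))
print(coding_length, coding_length // 3)
```

Both stated pairs are reproduced by the same offset, and the
stated coding region is 2721 nucleotides long, a multiple of
three corresponding to 907 codons.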

Fig. 18:

Restriction map and open reading frames of the Bam
L-fragment

A. Restriction map:

The positions of the restriction enzymes BamHI,
EcoRI, HindIII and PstI are indicated relative to
the nucleotide positions of the Bam L-fragment.

B. Open reading frames:

The open reading frames of the Bam L-fragment are
indicated as boxes and given for both polarities
of the respective DNA sequence.

Fig. 19:

Restriction map of the plasmid pUCLP1.9

The size of pUC8 is 2.7 kb. The cloning region 3'
of the lacUV5 β-galactosidase promoter and operator (PO)
contains an EcoRI(E), BamHI(B), SalI(S), PstI(P), and
HindIII(H) site. The 1.9 kb subfragment of the Bam
L-fragment, indicated by an open bar, was inserted into
the PstI site. The reading frame has the same orientation
as the lacZ-coding part of pUC8 (indicated by a heavy
black line).

Fig. 20:

Restriction map of plasmid pURLP1.9

The vector pUR290 has a length of 5.2 kb and consists
of the β-lactamase gene (AMP) and the origin of
replication of pBR322. The β-galactosidase gene is
indicated by a heavy black line, the respective
promoter-operator region by PO. The restriction enzymes are
abbreviated as follows: BamHI (B), ClaI (C), EcoRI (E),
HindIII (H), PstI (P), and SalI (S).

The 1.9 kb insert of pUCLP1.9 was introduced between the
BamHI and the HindIII site.

Fig. 21: 

DNA and amino acid sequence of the fusion protein encoded
by plasmid pURLP1.9

bp 4 - 3069:      β-galactosidase
bp 3070 - 3072:   pUR290 linker (given in lower-case letters)
bp 3073 - 3088:   pUC8 multiple cloning site (BamHI to
                  PstI; given in lower-case letters)
bp 3089 - 4985:   PstI fragment of gp 350
bp 4986 - 4994:   pUC8 multiple cloning site
                  (PstI to HindIII; given in lower-case letters)
bp 4995 - END:    pBR322 sequence.
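
The segment boundaries listed for Fig. 21 can be checked for consistency with a few lines of arithmetic. The following Python sketch is only an illustration: the names `SEGMENTS`, `segment_lengths` and `is_contiguous` are hypothetical, and coordinates are assumed to be 1-based and inclusive. It confirms that the listed segments follow one another without gaps and that the PstI fragment of gp 350 spans 1897 bp, in line with the "1.9 kb" insert named in Figs. 19 and 20.

```python
# Segment boundaries as listed in the legend of Fig. 21 (assumed 1-based, inclusive).
# The final segment (bp 4995 to END, pBR322 sequence) has no stated end and is left out.
SEGMENTS = [
    ("beta-galactosidase", 4, 3069),
    ("pUR290 linker", 3070, 3072),
    ("pUC8 multiple cloning site (BamHI to PstI)", 3073, 3088),
    ("PstI fragment of gp 350", 3089, 4985),
    ("pUC8 multiple cloning site (PstI to HindIII)", 4986, 4994),
]


def segment_lengths(segments):
    """Return (name, length in bp) for each inclusive coordinate range."""
    return [(name, end - start + 1) for name, start, end in segments]


def is_contiguous(segments):
    """True if every segment starts one base after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(segments, segments[1:]))


if __name__ == "__main__":
    for name, length in segment_lengths(SEGMENTS):
        print(f"{name}: {length} bp")
    print("contiguous:", is_contiguous(SEGMENTS))
```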



Fig. 22: 



Expression of the β-gal : gp 350 protein encoded by plasmid pURLP1.9

Lanes 1 and 2 show a Coomassie blue stained PAGE of an
uninduced (lane 1) and an IPTG-induced (lane 2) clone
containing pURLP1.9.
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Since there are a lot of bands with different molecular 
weights, it seems that the main part of the protein is 
incompletely synthesized. 

Lane 3 shows a peroxidase-DAB stained Western blot with
NPC sera. All newly expressed proteins are antigenic
except the band at 116 kD, which corresponds to
β-galactosidase.

The bacterial background bands are due to the high content
of antibacterial antibodies in the serum used.



Fig. 23:

Purification of the plasmid-encoded β-gal : gp 350 fusion protein

A. Coomassie stained gel; B. Western blot, treated with
NPC serum

Lane 1: Uninduced culture
Lane 2: IPTG-induced culture
Lane 3: Insoluble proteins of the lysed bacteria, dissolved
in 8 M urea
Lane 4: β-gal : gp 350 protein containing fractions, pooled
after Sepharose 2B-CL chromatography.



Figure 24: Computer-predicted secondary structure
of gp350 comprising the relative values of hydrophilicity
(dark circles) and hydrophobicity (grey circles). At
the scale given, only the loop structures can be seen
clearly as line turns of 180°.

Figure 25: Expression of gp350 fragments as β-gal fusion
proteins.

The Coomassie blue stained expression products encoded by plasmids
pURLEP600 and pURLXP390 are shown in the upper part
(pUR288 as control). In the lower part the same samples
are shown after immunostaining to demonstrate their
reactivity with EBV-positive sera.

Figure 26: Expression of the proteins encoded by pUCLEP600
and pUCARG1230 and their reactivity against EBV-positive
sera, with pUC8 as control; upper part: Coomassie-stained
SDS-PAGE, lower part: immunostained Western blot.

Figure 27: Restriction map of the region coding for 
gp 250/350 

The dark bar indicates the region coding for gp 250/350.
Furthermore the restriction enzymes used for subcloning, the splice
sites, and the inserts of the recombinant expression
plasmids constructed according to examples 13 and 15-17
are shown.

Figure 28: DNA sequence and corresponding amino acid sequence
of EBV-related proteins.

A. Protein p54

Nucleotide sequence and derived amino acid sequence
of protein p54, which is identified in in vitro
translation as p47 but correlated with immunoprecipitation
with monoclonal antibodies

B. Protein p90

C. Protein p143

D. Protein p150

Figure 29: Expression of the β-gal::p150 fusion proteins

IPTG-induced clones indicated on top were separated after
lysis in a 10 % SDS-PAGE and the proteins were stained
with Coomassie blue. As a control pUR288 was applied to
show the size of the β-galactosidase. All clones produce
new proteins larger than the control clone and corresponding
to the insert size.

Figure 30: Antigenicity of the β-gal::p150 fusion proteins

The same lysates from clones shown in Fig. 29 were transferred
to nitrocellulose and EBV-related antigens were
visualized by immunostaining (see supra). The clone encoding
the N-terminal part reacts strongly.

Figure 31: Map of the p150 encoding region

The p150 encoding region is shown as a dark bar. The restriction
sites used for subcloning and the resulting
pUR clones are also indicated.

Best mode of carrying out the invention

Example 1

Identification of an antigen suitable for diagnosis of NPC 

In order to obtain the desired DNA sequences coding for
EBV-related antigens of diagnostic significance the
following strategy was developed:
Immunoprecipitation of Epstein-Barr viral proteins with
various sera from normal adults, patients with fresh
infectious mononucleosis or nasopharyngeal carcinoma
was used to identify antigens which are of relevance for
the diagnosis of immune status and characteristic for a
particular disease (Figure 1). These antigens have been
localized on the Epstein-Barr virus genome by hybrid-selected
translation. With the use of sequence data, these
genes were subcloned from EBV-DNA and expressed in eukaryotic
and prokaryotic cells.

It was shown by immunoprecipitation that EA and VCA are
not single antigens but families of antigens that consist of
several polypeptides (G.J. Bayliss, H. Wolf, "The regulated
expression of Epstein-Barr virus. III. Proteins
specified by EBV during the lytic cycle", J. Gen. Virol. 56,
p. 105 (1981)).

For the immunoprecipitation the EBV-producing, MA-positive
cell line P3HR1, the EBV-positive, non-producing Raji cell
line and the EBV-negative cell line BJA-B were used. When
the cells reached a density of about 1o /ml, they were 
diluted with an equal volume of fresh medium. For induction
of EBV antigens, P3HR1 cultures were treated with 40 ng/ml
phorbol-12-myristate-13-acetate (modified from zur Hausen
et al. (H. zur Hausen, F.J. O'Neill, U.K. Freese, E. Hecker,
"Persisting oncogenic herpes virus induced by the tumor
promoter TPA", Nature 272, p. 373 (1978))) and 3 mM butyric
acid immediately after subculture. For the labelling of
the proteins, the cells were collected by low-speed centrifugation
and resuspended at a density of 2 x 10⁶ cells/ml
in methionine-free MEM culture medium containing between
50 and 100 µCi/ml ³⁵S-methionine. The cells were incubated
at 37°C/5 % CO2 for 4 h and subsequently washed with cold
Hanks' phosphate buffered saline (PBS) and resuspended in
cold IP buffer (1 % Triton X-100; 0.1 % SDS; 0.137 M NaCl;
1 mM CaCl2; 1 mM MgCl2; 10 % glycerol; 20 mM Tris-HCl pH 9.0;
0.01 % NaN3; 1 µg/ml phenylmethylsulphonyl fluoride) at a
concentration of 5 x 10⁶ cells/ml. Then the cells were disrupted
by sonication and incubated on ice for 60 min. The
extracts were clarified by centrifugation at 100,000 x g for
30 min at 4°C.

³⁵S-methionine labelled extracts were immunoprecipitated
exactly as described (G.J. Bayliss et al., supra).
The results are shown in Fig. 1.



Antibodies to p138, p105, p90 and p80 are present in each
of the NPC sera and only in some of the other EBV-infection
specific sera. By analogy, antibodies to p54 (identical
to p58 in G.J. Bayliss et al., supra) are significant for
fresh EBV infection (infectious mononucleosis) as compared
to the convalescent state. Antibodies to p150, p143, p110
are also present in convalescent sera of healthy individuals
and can serve as markers for immunity or, in
connection with IgM-specific tests, for fresh EBV infection
or, in connection with IgA-specific tests, for EBV-related
neoplasia (NPC and BL).

The next step was to localize the antigens on the
EBV genome. Therefore RNA was prepared by lysing the
EBV-producing cells described above with 4 M guanidine
isothiocyanate and 0.5 M 2-mercaptoethanol two days after
induction (J.M. Chirgwin, A.E. Przybyla, R.J. MacDonald,
W.J. Rutter, "Isolation of biologically active ribonucleic
acid from sources enriched in ribonuclease", Biochemistry
18, p. 5294 (1979)). The lysate was centrifuged for one
hour at 20,000 rpm (SW 41, Beckman) and the supernatant
layered on top of 2 ml CsCl, density 1.8 g/cm³. After
centrifugation for 17 hours at 150,000 g, the RNA pellet
was extracted with chloroform and precipitated with
ethanol. 100 µg total cellular RNA was hybridized for
2.5 hours at 52°C in 65 % formamide and 0.4 M NaCl to
16 µg cloned EBV-DNA, which was sonicated, denatured
and spotted on small nitrocellulose filters. Bound mRNA
was eluted by boiling the filters for 90 sec in water. The RNA
was translated in vitro with an mRNA-dependent rabbit reticulocyte
lysate. The translation products were immunoprecipitated
using 5 µl of a pool of human NPC sera for one
assay after preincubation with a protein extract from unlabelled
EBV-negative BJA-B cells as previously described
(G.J. Bayliss, G. Deby, H. Wolf, "An immunoprecipitation
blocking assay for the analysis of EBV induced
antigens", J. Virol. Methods 7, p. 229 (1983)). The
immune complexes were bound on protein A-Sepharose, washed,
eluted by boiling the beads in electrophoresis sample
buffer and loaded onto SDS-polyacrylamide gels. This
procedure allowed mapping of a number of viral proteins
(Fig. 2) relative to the EBV B95-8 genome. The localization
of p138 is given in Fig. 2. Using sequence data
(R. Baer, A.T. Bankier, M.D. Biggin, P.L. Deininger, P.J.
Farrell, T.J. Gibson, G. Hatfull, G.S. Hudson, S.C. Satchwell,
C. Seguin, P.S. Tuffnell, B. Barrell, "DNA sequence
and expression of the B95-8 Epstein-Barr virus genome",
Nature 310, p. 207 (1984)), appropriate open reading
frames for p138 and p54 were identified (Fig. 2). These
open reading frames are completely contained in the right
part of the BamA-fragment at the right end of the viral
genome.

Example 2

Cloning of the p138 encoding region

According to the sequence data of R. Baer et al. (supra),
there is a large open reading frame contained in the
BamA-fragment of EBV B95-8 which is suitable for encoding
p138. The nucleotide sequence, the corresponding amino acid sequence
and the respective regulatory elements of the gene of
p138 are given in Fig. 3.

50 µg DNA of the plasmid pBR322-BamA (J. Skare et al., supra) were digested
with 50 U XhoI (Boehringer) for 2 h at 37°C in a total volume
of 150 µl containing 150 mM NaCl, 10 mM MgCl2, 6 mM mercaptoethanol,
6 mM Tris-HCl, pH 7.9. 30 µl stop buffer
(10 mM Tris-HCl, 50 mM EDTA, 60 % sucrose, 1 % bromophenol
blue, pH 7.5) were added, the mixture was loaded onto a
preparative 1 % agarose gel in acetate buffer (0.04 M Tris-acetate,
2 mM EDTA, pH 7.6), and electrophoresed for 16 h
at 40 V at 4°C. As a size marker HindIII-digested λ-phage
DNA (Boehringer) was used. After staining the gel in Tris-acetate
buffer with ethidium bromide (0.5 µg/ml) for
1 h at room temperature (RT), the DNA was visualized by UV
illumination and the bands corresponding to 3.0 and 3.3 kb
were excised (the 3.0 kb XhoI-generated fragment is the
desired fragment, the 3.3 kb XhoI-generated fragment is
a partial digest product (one XhoI restriction site was
not cut)).

The DNA of the bands was eluted by putting the agarose
pieces into dialysis bags, adding 3 volumes of Tris-acetate
buffer and electrophoresing for 4 h (100 V, 4°C). Further
purification was carried out by chromatography on
Elutip D columns (Schleicher & Schuell) according to the
procedure recommended by the manufacturer, extraction of
the contained ethidium bromide with isoamyl alcohol and precipitation
of the DNA by adding 2.5 volumes of ethanol and
incubating overnight at -20°C. The DNA was collected by
centrifugation in a Sorvall SS 34 rotor (17,000 rpm, 20
min) and washed with 70 % ethanol. After lyophilization
the DNA was dissolved in 15 µl TE buffer (10 mM Tris-HCl,
1 mM EDTA, pH 7.5).

The DNA concentration of the two isolated fragments was
estimated by electrophoresing 1 µl each in parallel with
100 ng and 1 µg of pUC8 DNA.



SalI-digested DNA of the vector pUC8 (deposited with the Deutsche
Sammlung für Mikroorganismen (DSM), Göttingen, West Germany,
under the accession number DSM 3420) (J. Messing, J. Vieira,
"A new pair of M13 vectors for selecting either DNA strand
of double-digest restriction fragments", Gene 19, p. 269
(1982)) was prepared as described before, except that, to
inhibit religation of the vector during the following
ligase reaction, the DNA was treated with alkaline phosphatase
(0.5 units (Boehringer), 30 min at 37°C).



Next, the two purified fragments were each
inserted into the cleaved vector (SalI and XhoI produce
the same cohesive ends, i.e. -TCGA-). For this purpose
a ligation reaction was carried out for each of the fragments
with 300 ng fragment DNA and 100 ng pUC8 DNA in a total
volume of 20 µl ligase buffer (10 mM Tris, 10 mM MgCl2,
6 mM mercaptoethanol, 0.6 mM ATP, pH 7.5) containing
1 U T4 DNA ligase (Boehringer). After 20 h at 14°C, 80 µl
TE buffer and 200 µl competent E. coli JM83 cells (ATCC 35607) (J. Vieira,
J. Messing, "The pUC plasmids, an M13mp7-derived system
for insertion mutagenesis and sequencing with synthetic
universal primers", Gene 19, p. 259 (1982)) were added.
The transformation was done according to the calcium
chloride procedure (M. Mandel, A. Higa, "Calcium dependent
bacteriophage DNA infection", J. Mol. Biol. 53, p. 154
(1970)). Then the cells were mixed with 1.5 ml L-broth
(5 g yeast extract, 10 g tryptone, 5 g NaCl), incubated for
1.5 h at 37°C, and finally plated on L-broth agar plates
(1.5 %) supplemented with 50 µg/ml ampicillin (Sigma) and
40 µg/ml X-gal (Boehringer). During this incubation
bacteria carrying religated pUC8 molecules yield blue
colonies and those which carry recombinant plasmids yield
white colonies.
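
The ligation step relies on two points that are easy to verify numerically: SalI (G^TCGAC) and XhoI (C^TCGAG) leave the same 5' overhang, and 300 ng of a 3.0 kb fragment against 100 ng of the 2.7 kb vector is a modest molar excess of insert. The Python sketch below illustrates both. The function names and the `ENZYMES` table are hypothetical helpers; the recognition sites and cut positions are the standard ones for these enzymes, and PstI is included only for contrast.

```python
# Recognition site (top strand, 5'->3') and cut index on the top strand.
# All three sites are palindromic, so the bottom-strand cut mirrors the top one.
ENZYMES = {
    "SalI": ("GTCGAC", 1),   # G^TCGAC
    "XhoI": ("CTCGAG", 1),   # C^TCGAG
    "PstI": ("CTGCAG", 5),   # CTGCA^G
}


def overhang(enzyme):
    """Return (overhang sequence, overhang type) for a palindromic site."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut               # cut position on the other strand
    lo, hi = sorted((cut, mirror))
    kind = "5'" if cut < mirror else "3'" if cut > mirror else "blunt"
    return site[lo:hi], kind


def compatible(enzyme_a, enzyme_b):
    """Two ends can anneal if overhang sequence and polarity are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def insert_to_vector_ratio(insert_ng, insert_kb, vector_ng, vector_kb):
    """Molar ratio of insert to vector; mass per kb is proportional to moles."""
    return (insert_ng / insert_kb) / (vector_ng / vector_kb)


if __name__ == "__main__":
    print(overhang("SalI"), overhang("XhoI"), compatible("SalI", "XhoI"))
    print(round(insert_to_vector_ratio(300, 3.0, 100, 2.7), 2))
```

With the amounts given above, the 3.0 kb fragment is present at roughly a 2.7-fold molar excess over the vector.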



To identify clones that carry the desired recombinant
plasmid, twelve white colonies were picked and
grown overnight at 37°C in L-broth. Aliquots of DNA preparations
made according to H.C. Birnboim and J. Doly ("A rapid
alkaline extraction procedure for screening recombinant
plasmid DNA", Nucl. Acids Res. 7, p. 1513 (1979)) were
digested with BamHI and HindIII and electrophoresed on an
agarose gel as described before. Furthermore, to demonstrate
the orientation of the integrated fragment, a
digest was carried out with BamHI and BglII. Finally
the 3.3 kb fragment was checked by an XhoI digest.

Plasmid pUC635 carries the 3.0 kb XhoI subfragment of
the BamA-fragment (pBR322 BamA) in the proper orientation
and the proper reading frame relative to the lac UV5
promoter and is used for the expression of nearly the
whole p138 (Fig. 4). The fusion protein encoded by pUC635 is composed of 12
amino acids of the β-galactosidase amino terminus, about
1020 amino acids of p138, 60 amino acids of the carboxy-terminal
part of the β-galactosidase and another 29 amino
acids of a pBR322-encoded region. Plasmid pUC6130 carries the
3.3 kb fragment in the opposite orientation (Fig. 4). Since
the strain E. coli K12 JM83 is not a β-galactosidase
repressor overproducer, the fusion protein is constitutively
expressed. Therefore the plasmid pUC635 was introduced
into the β-galactosidase repressor overproducer strain
E. coli K12 BMH71-18 (DSM 3413) (U. Rüther, B. Müller-Hill, "Easy
identification of cDNA clones", EMBO Journal 10, p. 1791
(1983)). Instead of strain E. coli K12 BMH71-18, strain
E. coli K12 JM109 can also be used (without essential alteration
of the experimental procedure).
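
The composition given for the pUC635 fusion protein allows a rough size estimate. The Python sketch below adds up the stated segments and converts the total to a molecular mass using an average of 110 Da per amino acid residue; that average is a common rule of thumb and an assumption here, not a value taken from this specification, and the names `PUC635_FUSION` and `estimate_mass_kda` are hypothetical.

```python
# Segment lengths (amino acids) of the pUC635 fusion protein as stated in the text.
PUC635_FUSION = {
    "beta-galactosidase amino terminus": 12,
    "p138 (approximate)": 1020,
    "beta-galactosidase carboxy-terminal part": 60,
    "pBR322-encoded region": 29,
}

AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def total_residues(segments):
    """Total number of amino acids in the fusion protein."""
    return sum(segments.values())


def estimate_mass_kda(segments, residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Approximate molecular mass in kDa from the residue count."""
    return total_residues(segments) * residue_mass / 1000.0


if __name__ == "__main__":
    print(total_residues(PUC635_FUSION), "residues")
    print(round(estimate_mass_kda(PUC635_FUSION), 1), "kDa (rough estimate)")
```

The estimate is sequence-independent and ignores any modification of the protein, so it only indicates the order of magnitude to look for on a gel.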



Besides pUC635 three other plasmids were constructed: pUC924,
pMF924 and pKK378 (Figures 6 to 8).

The insert of pKK378 starts at the same XhoI site and continues
up to the third XhoI site located 250 bp 3' of the stop
codon. This fragment of 3.3 kb was generated by an incomplete
digest and inserted behind the tac promoter and the start codon
of pKK240-11 (E. Amann et al., supra). The expression product
contains only two bacterial amino acids and its size is
smaller than the size of the expression product of pUC635
because the bacterial lacZ part is missing.

pUC924 contains the fragment from the BglII site to the
third XhoI site. pUC9 (DSM 3421) was used as vector. Since the size of the
insert is smaller than in pUC635 and since the stop codon from
p138 is used, the molecular weight of the expression product
is expected to be smaller than in pUC635 and pKK378.

The plasmid pMF924 was constructed from pEA305 (E. Amann et al., supra)
and the same BglII-XhoI fragment as in pUC924. pEA305 has a tac promoter
followed by the N-terminal part of the cI repressor; the
resulting fusion protein is expected to be 17 kD larger than in
pUC924.

These constructs were tested for the production of EBV-related
antigens by inducing the tac and lac promoters with IPTG and
separating the proteins by SDS-PAGE. No new bands, or only weak
ones, could be detected on Coomassie blue stained gels in
the regions with the expected sizes. But after transfer of
the proteins onto nitrocellulose and immunostaining with a
high-titered NPC pool serum and a peroxidase-conjugated second
anti-IgG antibody, new EBV-specific bands were clearly detectable
in all constructs (Fig. 4).

All expressed proteins display almost the expected size,
but the yield varied over a wide range. The proteins encoded
by pUC635 and pMF924 seem to be more stably expressed
than the non-fusion proteins from pUC924 and pKK378.
However, the amount of even the most highly expressed protein,
from pUC635, is too low for large-scale production, since
in the Coomassie-stained gel only a very weak band was
visible, which may be due to the large size of the eukaryotic
protein.

Example 3 

Immunological assay of the proteins encoded by pUC635,
pUC924, pMF924 and pKK378

The host cells transformed with plasmids pUC635, pUC924,
pMF924 and pKK378 were cultivated in L-broth supplemented
with 50 µg/ml ampicillin to a cell density of OD600 = 0.8.

Then, for the induction of the β-galactosidase, the lactose
analogue isopropyl-β-D-thiogalactopyranoside (IPTG; Sigma)
was added (final concentration: 1 mM). After a further
incubation of 1.5 h at 37°C, 1.5 ml of the culture were
centrifuged. The bacteria were resuspended in 200 µl boiling
mix (2 % SDS, 5 % mercaptoethanol, 3 % sucrose, 50 mM
Tris-HCl, pH 7.0) and heated for 10 min at 100°C.
20 µl of the resulting protein extract were separated on
a 12.5 % polyacrylamide gel and finally the proteins were
visualized by Coomassie blue staining, but since the
yield of the expression product is very low, immunostaining
was necessary. Therefore the electrophoretically
separated proteins were transferred to a nitrocellulose
filter, i.e. a "Western blot" was prepared (J. Renart,
J. Reiser, G.R. Stark, "Transfer of proteins from gels
to diazobenzyloxymethyl-paper and detection with antisera",
Proc. Natl. Acad. Sci. USA 76, p. 3116 (1979); S. Modrow,
H. Wolf, "Characterization of herpesvirus saimiri and
herpesvirus ateles induced proteins", in: Latent Herpes
Infections in Veterinary Medicine, Martinus Nijhoff Publ.,
p. 105 (1984)).

The Western blot was prepared with a current of
0.8 A for 3 h in Western blot buffer (72 g glycine,
15 g Tris, 1 l methanol, distilled H2O to 5 l). Then the
nitrocellulose was saturated with Cohen buffer for 3 h
(0.1 % Ficoll 400, 1 % polyvinylpyrrolidone, 1.6 % BSA,
0.1 % NP40, 0.05 % gelatine, 0.17 H3BO3, 28 mM NaOH,
150 mM NaCl, 6 mM NaN3, pH 8.2) and incubated overnight
with 1:50 diluted high-titered EBV-specific serum from
NPC patients. The serum had been preabsorbed with a bacterial
protein extract (1 ml/10⁹ E. coli cells) to reduce the
background generated by bacterial proteins. Afterwards unbound
IgG was removed by washing the nitrocellulose filter for
5 h in gelatine buffer (50 mM Tris-HCl, 5 mM EDTA, 150 mM
NaCl, 0.25 % gelatine, 0.5 % Triton, 0.2 % SDS, pH 7.5).
For visualizing the blotted EBV-specific proteins, rabbit
anti-human IgG antibodies coupled to peroxidase and
diluted 1:200 in TN buffer (154 mM NaCl, 10 mM Tris, pH 7.4)
were added. After 2 h at RT, unbound rabbit antibodies were
removed by washing with gelatine buffer as described
above. Finally the peroxidase reaction was carried out in
100 ml 50 mM Tris-HCl, pH 7.5, by adding 50 mg diaminobenzidine
(Sigma) and 40 µl H2O2 and incubating for 10 min at
RT. The results of this experiment are shown in Fig. 5.
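
The Western blot buffer is specified by weight (72 g glycine, 15 g Tris, 1 l methanol, water to 5 l). Converting this recipe to concentrations makes it easier to compare with other transfer buffers. The Python sketch below is an illustration under stated assumptions: the molar masses used (glycine 75.07 g/mol, Tris base 121.14 g/mol) are standard values not given in the text, "Tris" is assumed to mean Tris base, and the function names are hypothetical.

```python
# Molar masses in g/mol (standard values; Tris is assumed to be Tris base).
MOLAR_MASS = {"glycine": 75.07, "tris": 121.14}


def millimolar(grams, molar_mass, final_volume_l):
    """Concentration in mM of a solid dissolved to the given final volume."""
    return grams / molar_mass / final_volume_l * 1000.0


def percent_v_v(component_l, final_volume_l):
    """Volume fraction in percent (v/v)."""
    return component_l / final_volume_l * 100.0


if __name__ == "__main__":
    final_volume_l = 5.0
    print("glycine:", round(millimolar(72, MOLAR_MASS["glycine"], final_volume_l)), "mM")
    print("Tris:", round(millimolar(15, MOLAR_MASS["tris"], final_volume_l), 1), "mM")
    print("methanol:", percent_v_v(1.0, final_volume_l), "% (v/v)")
```

Under these assumptions the recipe works out to roughly 192 mM glycine, 25 mM Tris and 20 % (v/v) methanol.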
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Example 4 

Purification of the β-gal : p138 fusion protein encoded
by the plasmid pUC635

The clone E. coli K12 JM109 pUC635 was grown at 37°C
in 500 ml L-broth supplemented with ampicillin as described
above until the OD560 was 0.8. The fusion protein synthesis
was induced by IPTG (1 mM) and the incubation was continued
for another 2 h. Then the cells were collected by centrifugation
for 10 min in a GSA rotor (Sorvall) at 5,000 rpm
and resuspended in 50 ml 20 mM Tris-HCl, pH 7.5.
To lyse the cells, EDTA (50 mM final concentration)
and lysozyme (2 mg/ml final concentration) were added and
this mixture was incubated for 30 min at 37°C. Next,
the cells were sonicated (Labsonic 1510, Braun)
twice for 8 min, Triton X-100 was added to a final concentration
of 3 % and, after further incubation at 37°C
for 30 min, insoluble particles of the suspension were
pelleted by centrifugation (SS 34 rotor (Sorvall), 20 min,
10,000 rpm). The resulting pellet was dissolved in 20 ml
of a solution of 8 M urea, 10 mM Tris-HCl, 0.5 % β-mercaptoethanol,
pH 7.5, and recentrifuged as before.



Finally 80 mg of the proteins were subjected to column
chromatography (Sepharose 2B-CL (Pharmacia), length 80 cm,
diameter 3 cm) with 8 M urea, 10 mM Tris, 0.1 % β-mercaptoethanol,
pH 7.5, buffer. 30 µl of each of the collected
4 ml samples were analyzed in a 15 % PAGE and the fusion
protein-containing fractions were pooled.
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Example 5 

Cloning of p138 subregions coding for antigenic determinants
identified by computer analysis

In principle, diagnostic tests need only the antigenic determinant
subregions of the antigenic protein.
Therefore the p138 amino acid sequence was analyzed by a
computer program and the identified subregions of this
gene were introduced into suitable vectors. The production
of such small proteins has the advantage that these are
less vulnerable to rapid changes of antigenicity with
decreasing length of the product. Furthermore, especially
in conjunction with assays for class-specific antibodies,
they will be of diagnostic value.

According to the method of P. Chou and G. Fasman
("Conformational parameters for amino acids in α-helical,
β-sheet and random coil regions calculated from proteins",
Biochemistry 13, p. 211 (1974)) the
secondary structure of a protein can be calculated from
its amino acid sequence (primary structure).
Superimposed on the suggested structure, the program determines
the relative hydrophilicity and hydrophobicity.
Both data sets are combined and a computer graphic is
drawn that shows α-helical, β-sheet, β-turn and randomly
coiled regions of the secondary structure. The
hydrophilic and hydrophobic regions are shown as open and
closed circles, respectively.



An example of such computer graphic is shown for the 
p138 amino acid sequence in Fig. 9. 
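
The hydrophilicity part of such an analysis can be sketched as a sliding-window average over a per-residue scale. The specification does not say which scale or window its program used, so the Python sketch below substitutes the published Kyte-Doolittle hydropathy scale and a window of 7 residues purely as assumptions; the function names are hypothetical, and the secondary-structure prediction itself (Chou-Fasman) is not reproduced.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
# Used here as a stand-in; the scale of the original program is not stated.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of each window of `window` consecutive residues."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=0.0):
    """Start indices (0-based) of windows whose mean lies below the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score < threshold]


if __name__ == "__main__":
    demo = "RRRRRRRIIIIIII"  # artificial sequence, not taken from p138
    print(hydropathy_profile(demo))
    print(hydrophilic_windows(demo))
```

Windows with a strongly negative mean would correspond to the hydrophilic stretches that the text treats as candidate antigenic sites when they coincide with predicted β-turns.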
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Based on the assumption that antigenic sites are mainly 
located in hydrophilic B-turns which are located on the 
surface of the protein, the region between about amino
acid 520 and the carboxy-terminus of p138 should be antigenic.
The corresponding DNA sequence is represented by
a PstI fragment of pUC635.



Thus pUC635 was cleaved with PstI and all PstI fragments
were isolated and introduced into PstI-cleaved pUC8;
the remaining vector fragment with an additional 400 bp (up
to the first PstI site of the p138 coding sequence) was
religated (all methods as described in example 2).

The resulting recombinant plasmids were designated pUC
P400, pUC P380, pUC P600, pUC P210, pUC P750, and
pUC P540, respectively.

The amino-terminal region of the p138 encoding sequence was
cloned by digesting the plasmid pBR322-BamA with PstI and
HgiAI and inserting said fragment into PstI-cleaved pUC9
(J. Messing et al., supra) (methods as described in example 2).
The resulting recombinant plasmid is designated
pUC HP.

With the exception of pUC HP, in which translation stops
at the 3' end of the insertion, orientation
and reading frames relative to pUC8 are correct in all subclones.

Finally the recombinant plasmids were introduced into
E. coli K12 JM109 cells.

Example 6

Expression of antigenic determinants identified by computer
analysis using pUR288 p138 subclones

Since the products of pUC subclones (example 5) sometimes are not stably
expressed in bacteria, because they cannot build up
a suitable tertiary structure due to their shortness
and can therefore be degraded by proteases to a larger
extent than complete proteins, we constructed recombinant plasmids encoding
large fusion proteins using at least a part of the β-galactosidase
encoded by pUR288 (DSM 3415) (U. Rüther et al., supra) by cleaving
said PstI-fragment subclones with BamHI and HindIII, isolating the
respective fragments and ligating them into BamHI- and HindIII-cleaved
pUR288 (all methods as described in example 2).

The expression was carried out in E. coli K12 JM109.
The products were analyzed as described in example 3.
After Coomassie blue staining of the gel several large
fusion proteins of different size were detected; however,
after preparation of a Western blot, only the products
expressed by pUR600 and pUR540 showed a specific reaction
with the IgG antibodies mentioned (Fig. 10).

These results are in good agreement with the computer
analysis.

Additionally the clones obtained
according to example 5 were expressed according to
example 3. The products, too, were analyzed as described
in example 3. The Coomassie blue stained gel shows
that only plasmids pUCP600 and pUCP380 code for a
stable fusion protein. The Western blot shows that only
the pUCP600-derived fusion protein is antigenic (Fig. 11).

This fusion protein contains 11 amino acids encoded
by the amino-terminal cloning site, a region encoded by
about 600 bp of p138 and carboxy-terminal amino acids of
the lacZ gene. Thus, the recombinant expression plasmids
pUR600 and pUR540 as well as pUCP600 can be used for
the production of large and small fusion proteins,
respectively, containing an antigenic determinant
of EBV protein p138.

Example 7 

Application of the protein encoded by plasmid pUCP600 for the
stabilization of per se unstable parts of eukaryotic
proteins

The experiments of example 6 showed
that the p138-derived protein parts (regions) are
unstable with the exception of the protein encoded by plasmid
pUCP600. The second antigenic region from the C-terminus of
p138 (P540, see Fig. 9) is not stably expressible using the
recombinant pUC vector pUCP540.

The ability of the P600 region of p138 to stabilize
such a per se unstable expression product is shown in
this example.

For this purpose it was necessary to remove the
5' PstI site of pUCP600 by digesting the plasmid with SstI
and HindIII (the SstI site is located about 20 bp 3' of
the first PstI site). The p138-related SstI-HindIII fragment
was inserted into SstI/HindIII-cleaved pUC12 (DSM 3422)
(J. Messing, "New M13 vectors for cloning", in Methods in
Enzymology Vol. 101, Part C, R. Wu, L. Grossman and
K. Moldave (eds.), Acad. Press, New York, 1983, 20-78).




Then the resulting
recombinant plasmid was digested with EcoRI and PstI.
The obtained 600 bp fragment was inserted into plasmid pUC8.
The 5' PstI site was now replaced by an SstI site
and thus the reading frame is reconstituted at the
3' and the 5' end of the insert (Fig. 12a). The resulting
recombinant plasmid pUC601 still expresses a stable
product (Fig. 13).

Between the PstI and HindIII sites at the 3' terminus of
the EBV-encoded sequence a synthetic oligonucleotide,
obtained according to known methods and coding in frame for
5 arginines and 2 stop codons, was inserted as shown in
Fig. 12b. The resulting plasmid pUCARG601 encodes the
P600 region of p138 fused at its C-terminus to 5 arginine
residues.
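
What "coding in frame for 5 arginines and 2 stop codons" means can be illustrated with a short translation check. The oligonucleotide in the Python sketch below is hypothetical: the actual sequence is the one shown in Fig. 12b and is not reproduced here, and the flanking restriction-site overhangs are omitted. Only the arginine and stop codons of the standard genetic code are tabulated; any other codon is reported as "X".

```python
# Arginine and stop codons of the standard genetic code (DNA alphabet).
CODON_TABLE = {
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R", "AGA": "R", "AGG": "R",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Hypothetical example sequence, NOT the oligonucleotide of Fig. 12b.
HYPOTHETICAL_OLIGO = "CGTCGTCGTCGTCGTTAATAG"


def translate(dna):
    """Translate codon by codon from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def encodes_arg_tail(dna, arginines=5, stops=2):
    """True if the sequence reads as the requested Arg run followed by stops."""
    return translate(dna) == "R" * arginines + "*" * stops


if __name__ == "__main__":
    print(translate(HYPOTHETICAL_OLIGO))
    print(encodes_arg_tail(HYPOTHETICAL_OLIGO))
```

Shifting the same sequence by one base no longer gives the arginine run, which is why the insertion has to preserve the reading frame of the preceding P600 sequence.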

In a last step the PstI fragment encoding the P540 region of
p138 was ligated to the PstI fragment encoding the
P600 region of p138 after digestion with PstI. The
resulting recombinant plasmid pUCARG1140 encodes a
stable protein of about 43 kD which contains two
antigenic sites of p138 fused in frame. In this fusion
protein the protein region P600 stabilizes the protein
region P540 (Fig. 13). The arginine residues at the
carboxy terminus of the expression product may be used for
the purification of the resulting fusion protein as
described by Sassenfeld and Brewer (supra) (Fig. 16).

Example 8 

Construction of the recombinant plasmid pUCARG680 

From the plasmid pUCARG1140 a modified version was constructed
which lacks 435 bp of the p138 encoding region,
namely the C-terminal part of the P600 fragment and the N-terminal
part of the P540 fragment. The main antigenic
sites predicted by the computer program are still present.
The plasmid was designated pUCARG680 and its construction
was achieved by digesting pUCARG1140 with NcoI
(cleavage sites correspond to bp 1841 and bp 3243 in
Fig. 3). Since the reading frames at the P600 NcoI site
and the P540 NcoI site do not match, the sticky ends were
removed with S1 nuclease.

30 µg of pUCARG1140 were digested with NcoI, the 3.3 kb
vector-p138 fragment was separated by gel electrophoresis
and purified. 5 µg of this DNA fragment were digested
with 100 units S1 nuclease for 15 min at room temperature
in 100 µl containing 33 mM Na-acetate, 50 mM NaCl,
0.03 mM ZnSO4, pH 4.5. The digest was stopped by phenol
extraction. After precipitation with ethanol the DNA
was religated with T4 DNA ligase and used to transform
competent E. coli K12 JM109 cells. The resulting clones
were screened for the appearance of a new protein of
30 kD in size (pUCARG680). The shortened P600/P540 fusion
protein encoded by pUCARG680 still reacts as an antigen.

The newly constructed recombinant plasmid pUCARG680 was
deposited with the DSM under the deposition number DSM 3408.
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Example 9 

Assay of the antigenicity of the fusion protein encoded 
by plasmid pUCARG1140 



Immunoblots with the fusion proteins encoded by the
recombinant plasmids pUCARG1140, pUR540, and pUR600
(examples 6 and 7) using individual NPC sera reveal
that the immunological reactions differ between
patients (Fig. 14). Note that said plasmids encode
fusion proteins containing
the p138 regions P540 + P600, P540, and P600,
respectively (see Fig. 9).
Whereas in NPC serum no. 352 the main fraction of the
IgG and IgA antibodies is directed to the P540 region,
the main fraction in NPC serum no. 354 is directed to the
P600 region of p138. A representative pool prepared from
many sera from NPC patients did not detect additional
antigenic sites. The conclusion from this finding is that
the antigenic determinants P540 and P600 as encoded by
the recombinant plasmids of the present
invention are necessary and sufficient to achieve the
desired specificity for ELISA tests useful for diagnostic
purposes.

Example 10

Application of plasmid pUCARG1140 encoded fusion protein
for the detection of NPC in ELISA tests

The purified fusion protein encoded by pUCARG1140 was
coated on microtiter plates. Ten individual NPC sera
were tested for their IgG and especially for their
IgA reactivity. The IgA anti-EA titer of these sera had
previously been determined in conventional immunofluorescence
tests. The highest titer found was 1:80. In the ELISA
test shown in Fig. 15, two EBV-negative sera, one NPC serum
pool and ten individual NPC sera were tested up to a
dilution of 1:10640. The test was performed according to
the usual ELISA protocol. Bound antibodies were detected
with peroxidase-conjugated mouse anti-human IgG or
IgA and peroxidase reaction. All NPC sera show a reaction
with the coated antigen (up to 1:2560 in IgA) and no
background reaction could be observed in the negative
controls. This result indicates that the pUCARG1140
encoded expression product is suitable for the diagnosis
and early detection of NPC.
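
The titers quoted in this example are endpoint titers, i.e. the highest serum dilution that still gives a signal above background. The following Python sketch illustrates that read-out. The absorbance values, the twofold dilution series and the cut-off rule (mean of the negative controls plus three standard deviations) are illustrative assumptions and are not data or procedures taken from Fig. 15.

```python
from statistics import mean, stdev


def cutoff_from_negatives(negative_ods, n_sd=3.0):
    """Assumed cut-off rule: mean of the negative controls plus n_sd standard deviations."""
    return mean(negative_ods) + n_sd * stdev(negative_ods)


def endpoint_titre(readings, cutoff):
    """Return the highest reciprocal dilution that is still positive, or None.

    `readings` maps reciprocal dilution -> absorbance. Scanning stops at the
    first dilution at or below the cut-off, so an isolated high reading
    further down the series is ignored.
    """
    titre = None
    for dilution in sorted(readings):
        if readings[dilution] <= cutoff:
            break
        titre = dilution
    return titre


# Hypothetical absorbances (not taken from the patent).
negatives = [0.05, 0.07]
serum = {10: 1.80, 20: 1.60, 40: 1.30, 80: 1.00, 160: 0.80,
         320: 0.55, 640: 0.40, 1280: 0.28, 2560: 0.21, 5120: 0.09}

cutoff = cutoff_from_negatives(negatives)
print(f"cut-off = {cutoff:.3f}, endpoint titre = 1:{endpoint_titre(serum, cutoff)}")
```

With these made-up readings the series stays positive down to 1:2560, which is the kind of value reported above for the IgA reaction.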
Example 11

Cloning of a subregion of the gene coding for gp 350 in
the vector pUC8



The coding region of gp 250 and gp 350 was mapped to the
Bam L-fragment (J. Skare et al., supra) of the EBV B95-8 genome.
As both polypeptides share identical regions it was supposed that
both proteins are encoded by overlapping reading frames
(M. Hummel, D. Thorley-Lawson, E. Kieff, "Epstein-Barr
virus DNA fragment encodes messages for the two major
envelope glycoproteins (gp 350/300 and gp 220/200)", J.
of Virol. 49, p. 413 (1984)). The sequence data of
Baer et al. (supra) revealed a large open reading frame in-
cluding a donor splice site and an acceptor splice site in said
Bam L-fragment of the virus genome (Fig. 17, 18).
It is assumed that gp 350 is the translation product
of the unspliced mRNA transcribed from this region and
gp 250 is a product of the corresponding spliced mRNA
(Fig. 17). Since both products are found in the viral
capsids it is assumed that a differential splicing of
said mRNA, in a manner comparable with the immunoglobulin
heavy chain genes (T. Honjo, "Immunoglobulin genes",
Ann. Rev. of Immunol. 1, p. 499 (1983)), takes place.
During this splicing 630 bp of the mRNA coding for gp 350
are removed to yield the gp 250 coding mRNA (Fig. 17 and
27 (dotted lines)) (R. Baer et al., supra).
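
A short arithmetic check makes the consequence of this splice explicit. Assuming the 630 bp figure stated above, the removed stretch is a whole number of codons, so the downstream reading frame is preserved and gp 250 would lack 210 amino acids relative to gp 350. The helper name below is hypothetical and serves only as an illustration.

```python
def deletion_effect(deleted_bp):
    """Return (whole codons removed, True if the downstream reading frame is kept)."""
    codons, remainder = divmod(deleted_bp, 3)
    return codons, remainder == 0


codons, in_frame = deletion_effect(630)
print(f"{codons} codons removed, reading frame preserved: {in_frame}")
```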

Therefore the whole or a part of the reading frame of
gp 350 was cloned for finally isolating and producing
a gp 350 related product. It should be kept in mind that
not only gp 250 but also gp 350 is a highly glycosylated
protein. In contrast, the proteins produced by expression
of the recombinant DNA molecules according to the present
invention differ from the respective viral proteins
normally occurring in nature. If expression is carried
out in prokaryotes, unmodified proteins are obtained,
whereas expression in eukaryotes gives proteins with
different patterns of glycosylation or other modifications
as compared to the natural product.

The Bam L-fragment was introduced into pBR322, and E. coli
K12 HB101 was transformed with the recombinant plasmid
obtained (J. Skare et al., supra).

Instead of the host E. coli K12 HB101 the host bacteria
used in the present invention can also be used.



The contents of the publications of M. Hummel et al. (supra),
J. R. North et al. (J. R. North, A. J. Morgan, J. L.
Thompson, M. A. Epstein, "Purified Epstein-Barr virus
Mr 340,000 glycoprotein induces potent virus-neutralizing
antibodies when incorporated in liposomes", Proc. Natl.
Acad. Sci. USA 79, p. 7504 (1982)) and D. A. Thorley-
Lawson and C. A. Poodry ("Identification and Isolation
of the Main Component (gp350-gp220) of Epstein-Barr Virus
Responsible for Generating Neutralizing Antibodies In Vivo",
J. Virol. 43, p. 730 (1984)) do not permit predictions that
subregions of the gp 250/350 encoding sequence code
for sufficiently antigenic and/or immunogenic proteins
and that these products, after selective introduction of
these subregions, can be stably expressed in prokaryotic
and eukaryotic cells. It is therefore surprising that
completely unmodified or differently modified
gp 250/350 related proteins of the present invention
are sufficiently active antigens and/or immunogens.
In particular, previous publications did not ex-
clude that minor carbohydrate residues of the protein
contribute significantly to the antigenic or immunogenic
potential of this protein.
As shown in Fig. 17, a 1.9 kb PstI-PstI fragment of the
Bam L-fragment (Hummel et al., supra) contains the part
of the gp 350 coding region beginning at amino acid
position 232 and ending at amino acid position 825.
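
The amino acid positions given above can be converted into a coding length as a plausibility check. The sketch below assumes inclusive numbering of both end positions; the function name is hypothetical. The result, 594 codons or about 1.8 kb, is of the same order as the fragment size of about 1.9 kb stated for the PstI-PstI fragment.

```python
def coding_length(first_aa, last_aa):
    """Return (number of codons, length in bp) for an inclusive amino acid range."""
    n_codons = last_aa - first_aa + 1
    return n_codons, 3 * n_codons


n_codons, n_bp = coding_length(232, 825)
print(f"{n_codons} codons = {n_bp} bp")
```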

A large scale preparation of the pBR322-BamL plasmid DNA
was done according to the method published by H. C. Birn-
boim and J. Doly ("A rapid alkaline extraction procedure
for screening recombinant plasmid DNA", Nucl. Acids Res. 7,
p. 1513 (1979)). 50 µg of this DNA were digested for 2
hours at 37°C with 100 units PstI (Boehringer) in 50
mM NaCl, 10 mM MgCl2, 1 mM DTT, 10 mM Tris-HCl, pH 7.5.
The digestion was stopped by addition of 1/5 vol. 50 mM
EDTA, 60 % sucrose, 2 % bromophenol blue. The resulting
solution was electrophoretically separated on a 1 %
agarose gel (Seakem, FMC) in Tris-acetate buffer (0.04 M
Tris-acetate, 2 mM EDTA, pH 7.6). As a size marker HindIII
digested λ-phage DNA (Boehringer) was used. After the
electrophoresis at 40 V for 14 hours at room temperature
(RT), the gel was stained in Tris-acetate buffer containing
0.5 µg/ml ethidium bromide.
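
The PstI digest described above (100 units, 2 hours, 50 µg DNA) can be put into perspective with a small calculation. The sketch assumes the commonly used unit definition, by which one unit cuts 1 µg of reference DNA to completion in 1 hour under optimal conditions; the definition actually applied by the supplier is not stated in the text, so the resulting factor is only a rough guide.

```python
def fold_overdigestion(units, hours, dna_ug, ug_per_unit_hour=1.0):
    """Ratio of nominal digestion capacity to the amount of DNA present.

    ug_per_unit_hour is an assumed unit definition (1 ug DNA per unit per hour).
    """
    return units * hours * ug_per_unit_hour / dna_ug


print(f"nominal excess: {fold_overdigestion(100, 2, 50):.1f}-fold")
```

Under this assumption the conditions correspond to a fourfold nominal excess of enzyme, which is a usual safety margin for a preparative digest.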

The DNA bands in the gel were visualized by UV illumina-
tion and the 1.9 kb PstI-PstI fragment was isolated as
described in example 2.

PstI digested DNA of the vector pUC8 was prepared as
described before, except that for inhibition of religation
of the vector during the following ligase reaction, the
DNA was treated with alkaline phosphatase (0.5 units
(Boehringer), 30 min at 37°C).

The concentration of the purified fragments was estimated
by electrophoresing 1 µl each in parallel with 100 ng and
500 ng of pUC8 DNA (under conditions described above).
400 ng of the 1.9 kb PstI-PstI fragment and 100 ng of the
PstI digested vector DNA were ligated, E. coli K12 JM109
was transformed with the ligated plasmid DNA and positive
clones were identified as described in example 2.
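
The masses used in this ligation translate into a molar excess of insert over vector. The sketch below assumes a vector length of about 2.7 kb for pUC8, a value that is not given in the text, and uses the fact that for double-stranded DNA the number of molecules is proportional to mass divided by length.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio insert:vector for double-stranded DNA (moles ~ mass / length)."""
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# 400 ng of the 1.9 kb fragment, 100 ng of vector; 2700 bp is an assumed size for pUC8.
ratio = insert_to_vector_molar_ratio(400, 1900, 100, 2700)
print(f"insert:vector = {ratio:.1f}:1")
```

With these figures the insert is present in roughly sixfold molar excess, which together with the phosphatase treatment of the vector favours recombinant over religated plasmids.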

The obtained clone was designated E. coli K12 JM109
pUCLP1.9 and the resulting recombinant plasmid pUCLP1.9,
respectively.

Example 12

Cloning of a subregion of the gene coding for gp 350 in
the vector pUR290

For the expression of a stable product of the gp 350
subregion said 1.9 kb PstI-PstI fragment was recloned
in the vector pUR290 (DSM3417) (Fig. 20) (U. Rüther et al., infra).
The resulting recombinant plasmid codes for a fusion
protein of an aminoterminal region of the β-galactosidase,
followed by the amino acids 232 to 825 of gp 350 and
amino acids coded by the cloning site of pUR290 and pBR322
nucleotide residues. The respective amino acid se-
quence is given in figure 21.

50 µg DNA of the plasmid pUCLP1.9 were digested with
100 units BamHI and HindIII and separated on a 1 % agarose
gel as described above. The resulting 1.9 kb BamHI/HindIII
fragment, which contains only a few more nucleotides than
the PstI-PstI fragment originally introduced into pUC8,
was separated from the other resulting fragments on a 1 %
agarose gel (as described above). Finally it was isolated
from the gel as described above and ligated into BamHI/
HindIII digested DNA of the vector pUR290 (U. Rüther,
B. Müller-Hill, "Easy identification of cDNA clones",
EMBO Journal 2, p. 1791 (1983)) according to the methods
described above.
The next step was the transformation of the β-galactosi-
dase repressor-protein overproducer strain
E. coli K12 JM109 with these recombinant
DNA molecules. The transformants were plated and analysed
as described above, except that the aliquots of the DNA
preparations were digested with BamHI/HindIII and EcoRI.
The resulting clone, E. coli K12 JM109 pURLP1.9, carries
the plasmid pURLP1.9, which is a recombinant of said
BamHI-HindIII 1.9 kb fragment of the plasmid pUCLP1.9 and
the vector pUR290 (see Fig. 20).

Example 13

gp 350 related polypeptides synthesized by E. coli K12
JM109 pURLP1.9

In an overnight culture E. coli K12 JM109 pURLP1.9
was grown at 37°C in 5 ml L-broth supplemented with 50 µg/
ml ampicillin. The culture was then diluted to an optical
density at 560 nm (OD560) of 0.4, and 4 ml of this bacterial
suspension were incubated at 37°C until an OD560 of 0.8.

The expression of the genetic information carried by
plasmid pURLP1.9 was then induced as described in
example 3 and finally the proteins were visualized by
Coomassie blue staining as described in example 3.

In comparison with the control experiment, several new
proteins, encoded by the plasmid pURLP1.9 and ranging in
size from 116 kD to 200 kD, were detected (Fig. 22). The
different sizes of the expression products may be due to
incomplete mRNA synthesis or translation. To prove that
the new proteins are EBV-related products, all the
electrophoretically separated proteins were transferred
to a nitrocellulose filter, i.e. a "Western blot" was
prepared according to the method described in example 3.

The results of this experiment are shown in Fig. 22.
Example 14

Purification of the β-gal::gp 350 fusion protein encoded
by the plasmid pURLP1.9

The replacement of the natural carboxy terminal amino
acid sequence of the β-galactosidase by a gp 350 related
amino acid sequence prevents the formation of β-galactosi-
dase tetramers. Furthermore the newly expressed fusion
protein is present in a high concentration in the bacterial
cell. Therefore the fusion protein precipitates in the
cytoplasm of the host cell.

According to the method described in example 4 the clone
E. coli K12 JM109 pURLP1.9 was used for the production
of the corresponding fusion protein.

The results of the several stages of this purification
procedure are shown in Fig. 23.

Example 15

Expression of selected antigenic epitopes of gp250/350
as β-galactosidase fusion proteins

Fig. 24 shows the computer-predicted secondary structure
of gp350 together with the relative values of hydrophilic
(dark circles) and hydrophobic (grey circles) areas. β-turns
or loop structures are indicated as line turns of 180°
(α-helices, β-sheet and coil structures are barely dis-
cernible in the scale used). Based on the assumption
that antigenic sites are mainly located in β-turns in
a hydrophilic environment, which may be exposed at the
surface of the protein, the regions at about amino acid 50
and amino acid 740 and 800-830, respectively, are expected to represent
antigenic epitopes.
Subcloning and expression of the N-terminus of gp250/350

The EBV BamHI-L fragment which was cloned in pBR322 (see J.
Skare et al., supra) was digested with EcoRI (restriction sites at
positions 650 and 1284 in the sequence given in Fig. 17),
the resulting 634 bp fragment was eluted from an agarose gel
after electrophoresis and ligated to EcoRI linearised pUC19
(DSM3425) (Yanisch-Perron et al., Gene 33, 103-119 (1985)). Then
E. coli K12 JM109 was transformed with the ligation pro-
ducts (all steps were carried out as described in Example 2).
According to example 2 the recombinant plasmids obtained
were tested for the orientation of their insert using suitable re-
striction enzymes. A recombinant plasmid carrying the insert in the
opposite orientation of the reading frame relative to the reading
frame of the lacZ gene of the pUC19 plasmid was designa-
ted pUC19LEP600 and used for further cloning.




pUC19LEP600 was digested with BamHI and PstI (the BamHI
site is derived from pUC19, the PstI site corresponds
to position 1248 in Fig. 17), the resulting 600 bp
fragment was inserted into pUR291 (DSM3418) (Rüther, supra), pre-
viously digested with BamHI and PstI. The resulting
recombinant plasmid pURLEP600 displayed the following
sequence in its linker region at the C-terminus of
the β-galactosidase:

pUR291 / pUC19 / gp250/350
β-gal-TGT CGG GGA TCC CCG GTA CCG GAG CTC GAA TTC CCA TTT ACC

/ pUR291
TGC AGC CAA GCT TAT CGA TGA



The expression of the fusion protein from this recom-
binant plasmid after IPTG induction was carried out
as described in example 3. The result of this experiment
is shown in Fig. 25. From Fig. 25 (lower part) it can
be seen that the expression product obtained is
recognized as a moderately antigenic protein by a pool
of NPC sera.
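
The linker sequence of pURLEP600 printed above can be read codon by codon to see which amino acids join the β-galactosidase part to the gp250/350 part and where the reading frame ends. The sketch below uses the standard genetic code and takes the sequence exactly as printed in this example, so any misprint in the printed triplets carries over into the translation; the function and variable names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1 and stop after the first stop codon (*)."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Linker region of pURLEP600 as printed in this example.
LINKER = ("TGT CGG GGA TCC CCG GTA CCG GAG CTC GAA TTC CCA TTT ACC "
          "TGC AGC CAA GCT TAT CGA TGA")
print(translate(LINKER))
```

Read in this frame, the printed triplets contain no internal stop codon and the frame terminates at the final TGA, which is what an intact fusion requires.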

Subcloning and expression of the C-terminus of gp250/350

The region covering the antigenic epitopes near the
C-terminus, which according to the computer-directed
analysis is also expected to be antigenic, was isolated
by digesting the plasmid pUCLP1.9 (see example 11) with
XmnI (restriction site at position 2760 in Fig. 17) and
HindIII (restriction site in the region derived from the
pUC plasmid). The purified 386 bp fragment was inserted
into pUC19 previously digested with HincII and HindIII.

The resulting plasmid, which was introduced into E. coli
K12 JM109, is pUC19LXP390.

The insert of pUC19LXP390 was cut out with BamHI and
HindIII and ligated into pUR288 digested with the same
enzymes. The resulting recombinant plasmid was intro-
duced into E. coli K12 JM109 and was designated
pURLXP390. The sequence in its linker region is as
follows:

pUR288 / pUC19 / gp350 / pUC8 /
β-gal-TGT CGG GGA TCC TCT AGA GTC AGT TCC CAC GTA CTG CAG CCA AGC

pUR288
TTA TCG

After IPTG induction a β-galactosidase fusion protein
was synthesized by said transformed host. In a Western
blot the expression product shows a high reactivity with
the NPC sera pool (see Fig. 25, lower part).

Example 16

Use of the β-gal::gp250/350 fusion proteins encoded
by the newly constructed recombinant plasmids in dia-
gnostic tests

Jilg et al. (W. Jilg and H. Wolf, "Diagnostic Significance of
Antibodies to the Epstein-Barr Virus-Specific Membrane Antigen
gp250", The Journal of Infectious Diseases 152, 222-225 (1985))
have shown the validity of gp250
and gp350 as antigens for the determination of the immune
status to EBV and especially for the diagnosis of chronic
EBV infections. Persons showing a normal immune response
after an EBV infection possess antibodies against gp250
and gp350, whereas patients suffering from chronic
EBV infection show an immune response only to gp350 which
still contains the additional intron sequence (see Fig.
27). The serological status of these persons can be
checked in ELISA tests using the three fusion proteins,
purified according to the method given in example 4. An
antibody reaction to all three fusion proteins indicates
a normal immune status. If there is no or a weaker reaction with the
proteins encoded by pURLEP600 and pURLXP390, but reacti-
vity against pURLP1.9 (which contains the intron se-
quences, see Fig. 27), a chronic EBV infection is very
likely.

IgA antibodies to the membrane protein gp250/350 and
to subfragments thereof are absent in the normal popula-
tion, but present in 58 % of Nasopharyngeal Carcinoma
patients when measured in a relatively insensitive
immunofluorescence assay. These results are similar to
the detection rate of IgA antibodies to EBV specific
early antigens in comparable test systems. By analogy,
the more sensitive ELISA test brings the detection rate
close to 100 % with only a minimal increase of false posi-
tive results. Therefore the antigens encoded by the
newly constructed recombinant plasmids pURLEP600,
pURLXP390, and pURLP1.9, respectively, are valuable
substances for the initial diagnosis and the control
of a therapy of Nasopharyngeal Carcinoma.

Example 17

Expression of the N-terminal gp250/350 fragment in the
plasmid pUC8

The recombinant plasmid covering the N-terminal region
of gp250/350, pUC19LEP600 (see example 15), was digested
with BamHI and PstI. The EBV derived fragment was iso-
lated and ligated into pUC8, previously digested with the
same enzymes. The sequence in the linker region of the
resulting clone, pUCLEP600, is the following:

pUC8 / pUC19 / gp350
ATT ACG AAT TCC CGG GGA TCC CCG GGT ACC GAG CTC GAA TTC CCA

/ pUC8
TTT ACC TGC AGC CAA GCT TAT

After induction with IPTG the fusion protein encoded
by pUCLEP600 is quite stable in the bacterial cells and
is recognized as an antigen by the NPC sera pool (see
Fig. 26). The bacterial fusion part consists of 14 amino
acids at the N-terminus and 9 at the C-terminus. The
value of this protein is its applicability in a vaccine,
especially when it is fused with the per se unstable
second antigenic region from the C-terminus as it was
determined with the β-gal fusion proteins (see example 15).

The inserts of the recombinant expression plasmids and
cloning plasmids constructed according to examples 11
and 15 to 17 are summarised in Fig. 27.

Example 18

Expression of the N-terminal part of gp250/350 as a
p138::gp250/350 fusion protein

The plasmid pUC19LEP600 (see example 15) was digested
with PstI and the resulting 600 bp fragment was ligated
to the PstI linearised plasmid pUCARG601 (see example 7).
The gp350 insert was checked to be in the same orien-
tation as the pUCARG601 reading frame and the resulting
recombinant plasmid was designated pUCARG1230. The
sequence in the linker region and at the junction sites
of the obtained plasmid is the following:

pUC8 / pUC12 / p138 / pUC19
ATG ACC ATG ATT ACG AAT TCG AGC TCT CTG ACC ATC CTG CAG GTC GAC
TCT AGA

/ gp350 / pUCARG601
GGA TCC CCG GGT ACC GAG CTC GAA TTC CCA TTT ACC TGC AGC GTC GTC
GTC GTC GTT GAT AAC GTT

After induction with IPTG, E. coli K12 JM109 carrying
pUCARG1230 expresses a stable and antigenic protein
which consists of antigenic regions from two different
proteins, namely p138 and gp250/350 (see Fig. 26).

Furthermore it can be used as antigen in ELISA tests and
also for vaccination.



Example 19

Neutralisation test with sera derived from rabbits
immunized with gp250/350 antigens

Supernatants from B95-8 cells were used to immortalize
human umbilical cord blood cells (lymphocyte fraction from
Ficoll/Hypaque gradient). 0,5 x 10 lymphocytes were
seeded per 0.5 ml microtiter plate well and 50 µl of a
cell-free supernatant of B95-8 cells were added and
allowed to adsorb for 2 hours at 37°C. After incubation
the virus-containing medium was removed, cells were
washed with RPMI1640 medium containing 10 % fetal calf
serum and incubated in 200 µl of the same medium at 37°C
in a 5 % CO2 atmosphere. Developing colonies of
lymphoblastoid cells were evaluated not sooner than
three weeks after the start of the experiment and
counted as positive transformation.

The neutralizing properties of sera were tested by pre-
incubating, for 1 hour under slight agitation, aliquots
of the Epstein-Barr virus containing B95-8 cell super-
natant with 20 µl of test serum, including the respective
preimmunization serum as control in a replicate test, before
the supernatant was allowed to adsorb to the umbilical
cord blood lymphocytes. After removing the inoculum from
the cells after 2 hours, the maintenance medium (RPMI1640
supplemented with 10 % FCS) was supplemented with 5 %
of the respective sera under test for neutralizing
activity. The following results were obtained:



Inoculum                  Serum                            Result

PBS (control, no virus)   -                                no colonies
Virus                     EBV-negative human serum         colonies
Virus                     EBV-positive human serum pool    no colonies
Virus                     rabbit preserum                  colonies
Virus                     rabbit immune serum 1            no colonies
Virus                     rabbit immune serum 2            no colonies
Example 20

Cloning and expression of antigenic fragments of the virus
capsid protein p150

The coding sequence of the diagnostically relevant protein
p150 (virus capsid antigen, VCA) (see Example 1, Fig. 28D)
was examined for antigenic sites and subcloned for the ex-
pression as β-galactosidase fusion proteins. The N-terminal
region which is expected to encode an antigenic site was
obtained by digestion of the Charon 4A phage EB 69-79
(G. N. Buell, D. Reisman, C. Kintner, G. Crouse, and B. Sugden,
"Cloning overlapping DNA fragments from the B95-8 strain of
Epstein-Barr virus (ATCC CRL 1612) reveals a site of homology
to the internal repetition", Journal of Virology 40, 977-
982 (1981)) with BamHI and a resulting 1176 bp fragment was
cloned into the BamHI site of pUC12. From a resulting plasmid
with the insertion in the proper orientation a 580 bp fragment
was excised with XhoI/SalI. The SalI site derives from the
pUC12 linker, the XhoI site is located 33 bp upstream from
the start of p150. This fragment was inserted into pUC8 di-
gested with SalI (SalI and XhoI share the same sticky end
sequence). The resulting clones were screened to have the
p150 start codon next to the BamHI site. From a proper clone
the p150 encoding region was cut out with BamHI and HindIII
and cloned into pUR290 digested with BamHI and HindIII
(pUR290CXH580). The expression of the β-gal::p150 fusion
protein from this clone is shown in Figure 29. Its ability
to react very well with an NPC serum pool can be seen from
Figure 30.

Further p150::β-gal fusion constructs were obtained according-
ly, for example the subclones
pUR290DBX320, pUR292DBB180, pUR290DTT700, pURDTT740,
pUR290DTP680 and pUR288DPP320, which are indicated in Figure 31.
From the designation of the subclones, the vector used can be
taken, e.g. for the construction of subclone pUR290DBX320
the vector pUR290 was used.
From Figure 30 the restriction enzyme sites used for subclo-
ning can also be taken. All clones with the exception of
pUR292DBB180 were constructed by subcloning the desired
fragments into pUC8 or pUC12 (see supra) to obtain
BamHI and HindIII sites suitable for the cloning into
pUR vectors (see supra). pUR292DBB180 was derived by
insertion of the 180 bp BglII-BglII fragment (see Fig. 31) into
pUR292 linearized with BamHI. Figures 29 and 30
show their expression and antigenicity.
The β-gal::p150 fusion protein encoded by pUR290CXH580 and
purified according to example 4 reacts in the ELISA test
as an EBV specific antigen, indicating its applicability
in diagnosis. Stable expression
was also obtained with the N-terminal fragment of p150 by
inserting the 580 bp fragment (used for the construction
of pUR290CXH580) into pUC18 (DSM 3424) using the BamHI and HindIII
sites. The resulting clone pUC18CXH580 expresses a stable
and antigenic protein of about 25 kD in size.
The following deposited plasmids, host bacteria and cell
lines were used for the purpose of the present invention.
The deposition was effected according to the Budapest
Treaty.

m.o.                   depository   deposition number

B95-8                  ATCC         CRL 1612
E. coli K12 JM83       ATCC         35607
E. coli K12 BMH71-18   DSM          3413
E. coli K12 JM109      DSM          3423
pUC8                   DSM          3420
pUC9                   DSM          3421
pUC12                  DSM          3422
pUC19                  DSM          3425
pUR288                 DSM          3415
pUR290                 DSM          3417
pUR291                 DSM          3418
pUCARG680              DSM          3408
pUC18                  DSM          3424

While we have hereinbefore presented a number of
embodiments of this invention, it is apparent that our
constructions can be altered to provide other embodiments
which utilize DNA sequences of the EBV genome coding

and for producing recombinant
DNA molecules. It is obvious to those skilled in the art
that other DNA sequences may also be used, which are
related to said DNA sequences and which may be derived
from other EBV serotypes. The EBV is easily obtainable
from known natural sources, e.g. from the saliva of
infected patients.

It is obvious that for obtaining biologically
comparable results other suitable vector/host systems
can be used. The invention is not limited to host/vector
systems presently available.
Claims:

1. A DNA sequence of the EBV genome, characterized in
that it corresponds to at least a part of an EBV-re-
lated antigenic protein having an amino acid sequence
as shown in Figures 3, 17, and 28.

2. A DNA sequence according to claim 1, characterized
in that it corresponds to at least a part
of protein p150, p143, p138, p110, p105,
p90, p80, p54 or gp250/350.

3. A DNA sequence according to claims 1 or 2,
characterized in that it contains additionally the
respective regulatory sequences in the 5' and 3'
flanks.

4. A DNA sequence hybridizing to a DNA sequence accor-
ding to anyone of claims 1 to 3 from whatever source
obtained including natural, synthetic or semisynthe-
tic sources, which is related by mutations, including
nucleotide substitutions, nucleotide deletions,
nucleotide insertions and inversions of nucleotide
stretches to a DNA sequence according to claims 1
to 3 and which encodes at least a part of a protein
according to claim 1.

5. A DNA sequence according to claim 2, characterized
in that it is inserted in the recombinant plasmid
pUC6130, pUC635, pUCP400, pUCP380, pUCP600, pUCP210,
pUCP750, pUCP540, pUCHP, pUC924, pMF924, pKK378,
pUR600, pUR540, pUCARG680 or pUCARG1140.

6. A DNA sequence according to claim 2, characterized
in that it is inserted in the recombinant plasmid
pUCLP1.9, pURLP1.9, pUC19LEP600, pUC19LXP390,
pURLXP390, pUCARG1230, pUCLEP600, pUCLXP390 and pURLEP600.
7. A DNA sequence according to claim 2, characterized
in that it is inserted in the recombinant plasmid
pUR290CXH580, pUR290DBX320, pUR292DBB180, pUR290DTT700,
pURDTT740, pUR290DTP680 or pUR288DPP320.

8. A DNA sequence characterized in that it contains in
reading frame at least two regions of a DNA sequence
of anyone of claims 1 to 4 derived from a single
EBV genome.

9. A DNA sequence according to claim 8, characterized
in that it contains in reading frame at least two
regions of a DNA sequence of anyone of claims 1 to
4 derived from different EBV genomes.

10. A DNA sequence according to anyone of claims 1 to 9,
characterized in that it contains at its 3' end
three to fifteen arginine codons positioned in the
correct reading frame followed by at least one stop
codon.

11. A DNA sequence according to anyone of claims 1 to 6,
characterized in that it contains at its 5' end
an oligonucleotide encoding an oligopeptide which serves
in the resulting polypeptide as a cleavage site for a
sequence specific protease or which is cleavable by
acid treatment with an acid such as formic acid.

12. A recombinant DNA molecule for cloning, characte-
rized in that it contains a DNA sequence according
to anyone of claims 1 to 11.

13. A recombinant DNA molecule for expression,
characterized in that it contains a DNA sequence
according to anyone of the claims 1 to 11 that
is operatively linked to an expression control
sequence.
14. A recombinant DNA molecule according to claim 13,
characterized in that the expression control
sequence is selected from the group of the
E. coli λ-promoter system, the E. coli lac-system,
the E. coli β-lactamase system, the E. coli
trp-system, the E. coli lipoprotein promoter,
yeasts and other eukaryotic expression control
sequences.

15. Vector carrying a part of the p138 encoding
DNA sequence the encoded protein of which stabi-
lizes in a fusion protein a protein encoded by
a DNA sequence ligated to its 3'-end and carrying
a DNA sequence encoding three to fifteen arginine
residues followed by at least one stop codon
which after insertion of the second DNA sequence is
positioned at the 3'-end of this second sequence
in the correct reading frame.

16. Vector according to claim 15 which is pUCARG601.

17. A host, characterized in that it is transformed
by at least one recombinant DNA molecule according
to anyone of claims 12 to 14.

18. A host according to claim 17 selected from the
group consisting of strains of E. coli, other
bacteria, yeasts, other fungi, animal and human
cells.

19. A protein having EBV-related antigenic determinants
suitable for diagnosis and therapy of EBV-related
diseases, characterized in that it is encoded by
a DNA sequence according to anyone of claims 1
to 11.
20. A polyantigen having at least two EBV-related
antigenic determinants suitable for diagnosis
and therapy of EBV-related diseases, characte-
rized in that it is encoded by a DNA sequence
according to anyone of claims 8 and 9.

21. A fusion protein, characterized in that it con-
tains a protein according to claims 19 or 20.

22. A diagnostic composition for the detection of
anti-EBV-antibodies, containing at least one
protein according to anyone of claims 19 to 21
in an amount sufficient to bind said anti-EBV-
antibodies in a sample.

23. A diagnostic composition for the detection of
EBV-related diseases, containing at least one
DNA sequence according to anyone of claims 1 to
11 in an amount sufficient for hybridization
to an EBV-related DNA sequence in a sample.

24. A pharmaceutical composition containing at least
one protein according to anyone of claims 19
to 21 in an amount sufficient for stimulating
in humans the production of antibodies to EBV
and a pharmaceutically acceptable carrier or
diluent.

25. A method of preventing EBV infection or therapy
of EBV-related diseases comprising administering
to a human being the pharmaceutical composition
according to claim 24 in an amount sufficient
to induce or to modulate an immunoresponse.
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Figure 1 ; Autoradiography of an immunoprecipitation of 
EBV- specific sera derived frpm patients suffering from 
mononucleosis and NPC. 




\ NPC I Pool 
Figure 2

Mapping of mRNAs relative to the EBV B95-8 genome.
Figure 3



GXXXGCGAGGCXGGGCGGCAIGCCAAGAXCGCXGAGACGXCAGXXCCCCGXGACGXGGGC 
1 + ■+* + -r + + ^0 

CCTGGCCAGCCXGACXGACXXCCXGAAAXCTT7GXAAAXGAAXAAACAGTGGGXGXXGCG 
,31 s. + + ^ 4- 120 

TGAXGAGXAAAGXGXAACAXXXAAXGXGGGAC7GGGAGGCCSGGGCGAXACCXXGGGCAX 
121 . * * + + 130 

HgiAI

CAXGCAGGGXGCACAGACTAGCGAGGAXAAXCXGGGCAGCCAGAGCCAGCCGGGXCCGXG 
131 -r 4- * + 240 

fie tSlnGl yAlaGlnXhr SepGluAspAsnLeuGlySerGlnSerGlnPr oGlyppoCy 

CGGCXACAXCTACTXXXACCCCCXGGCCACCXACCCXCXXAGGGAGGXGGCCACACXGGG 
241 * -r -r + + 300 

sGiyXyr II eXyr PheXyr ProLeuAlaXhr Xyr ppoLeuApgG luVa 1 A laThr LeuGI 

GACCGGCXACGCGGGCCACAGGIGCCXGACGGXGCCGCXCCXXTGCGGCAXCACCGXGGA 
301 + +_ + ^ + + 3^0 

vXhrGiyXyrAiaGiyHisAr gCysLeuIhr ValPraLeuLeuCysGly IleXhrValGl 

GCCGGGCXXCAGCAXCAAXGXCAAGGCXCXGCACAGGAGGCCCGACCCCAACTGCGGGCX 
361 — — _ -r — 420 

uProGlyPheSer IleAsnValLysAlaLeuHisArgArgProAs pProAsnCysGlyLe 

CCXACGCGCXACCTCCXAXCACAGGGACAXCXACGXGXXCCACAAXGCCCAXAXGGXXCC 
421 * + + + + _C + 480 

uLeuAroAl aXhrSer Xvp KisApgAs o IleXyp ValPheHisAsnAlaH isMetUaiPp 

XhoI

CCCCAXCXXXGAGGGGCCGGGXCXCGAGGCCCXCXGXGGCGAQACCAGGGAGGXGXXXGG 
431 4. + + + + -r 540 

oPr o IlePheGluGly ppoGlyLeuGIuAlaLetjCysGlyGIuIhrArgGluVa IPheGl 

GXACGACGCCXACAGCGCCCXACCGAGGG AAAGCXCCAAGCCGGGGGACXXCXTCCCCGA 
541 + -i- + + + + 500 

yXypAspAl aXyrSer A 1 aLeuProAr oGluSepSerLysPpoG lyAs pPhePhePpoGl 
AGGGCXAGAXCCCXCXGCCXACCXGGGGGCGGXGGCAAXAACCGAGGCCXXCAAGGAGCG 

GO i -r -r + A ( 3o0 

'jGiyLefjAspProSePAlaXyrLeuGlyAlayalAlalleXhpGluAlaPheLysG 1«jAp 

ACXCXACAGCGGAAACCXGGXGGCC AXXCCAXCGXXAAAACAGGAGGX AGCGGXGGGGCA 
b «S 1 1 + + + + + 720 

3Le«jXyr3epGlyAsnLe<jValAla IleProSepLeuLysGlnGluValA 1-aValGlyGl 

GXCXGCGAGCGXXAGGGXCCCGCXCXACGACAAGGAGGXGXXCCCAGAGGGCGXGCCCCA 
721 •■ h + * + + 730 

nSer A 1 aSep Val ApsVal Pr oLeuXy p As pLy sG 1 a IPhePr oG luG iy Va lProG 1 

GCXCCGCCAGXXXXACAACXCGGACCXCAGCCGCXGCAXGCACGAGGCGCXGXACACCGG 
731 + + + + + + 840 

nLeoArgGlnPheXypAsnSerAspLeuStjr ApgCysHetHisGluAlaLeuXyrXhrGl 

GCXGGCGCAGGCGCXGCGCGXCCGACGGGXGGGCAAGCXGGXGGAGCXGCXGGAGAAGCA 
341 + + + + + + goo 

yLeuA l aGlnAlaLe»jAr3ValAr3AP3ValGlyLysLauValGl»jLeuLe<jGluLvsGl 
Pst I 

GAGCCXGCAGGACCAGGCCAAGGXGGCCAAGGXGGCCCCCCXCAAGGAGXXCCCAGCCXC 
901 + + + + 1- + 960 
nSerLeuGlnAspGlnAlaLysValAlaLysValAlaProLeuLysGluPheProAlaSe



AACCAXCAGXCACCCGGACXCGGGaGCCXXAAXGAXXGXGGACAGCGCGGCAXGCGaGCX 
96! + + - + + -r 1020 

rThr IleSerHisProAspSepGlyAlaLeurtet IleValAspS*r AiaAlaCysGiuLe 

GGCGGXGAGCXACGCACCCGCCAXGCXGGAGGCCXCGCACGAGACCCCGilCCAGCCtCAA 
1021 + - + «- - 1030 

uAlaValSerXyrAiaProAlaMetLeuGluAlaSerHisGluXhpProAiaSepLsuAs 
CXACGACXCGXGGCCCCXGXXXGCCGACXGXGAGGGXCCAGAGGCCCGXGXGGCXGCGXX 

1081 + + + + + + 1140 

nXypAspSepXr pProLeuPheAlaAspCysGluGlyProGluAlaArq^alAIaAlaLe 

ACACCGAXAXAAXGCCAGCCXGGCCCCCCACGXGXCCACGCAGAXCXXXGCCACCAAXXC 
1141 + + + -r + + 1200 

uHt sArgXyrftsnAlaSerLeuAlaProHisValSerXhrGlrillePheAlaThr AsnSe 

CGXCCXCXACGXCXCGGGGGXCXCGAAGXCAACCGGXCAGGGCAAGGAGAGICXCXXXAA 
1201 + + + + + + 1260 

rValLeuIyrValSerGlyValSerLysSerXhrGlyGlnGlyLysGluSerLeuP'neAs 

Pst I 

CAGXXXCXACAXGACCCACGGCCXGGGGACCCXGCAGGAGGGGACCXGGGACCCCXGCCG 
1261 + + «*• + r -r 1220 

nS*rPheXyrHetXhrHisGlyLeuGlyThrLeuGlnGluGiyIhrXr pAs pProCysAr 

CCGACCCXGCXXCXCGGGCXGGGGXGGGCCAGACGXGACCGGAACCAACGGTCCGGGAAA 

1321 + r — + ' + 1380 

3ArgPraCysPheSerGlyXr pGlyGlyProAspValXhrGlyThr AsnGly ProGIyAs 

CXACGCXGTGGAGCACCXGGXCXAXGCGGCCXCCXXCXCGCCCAACCXICXXGCCCGCXA 
1331 + + -r- + h + 1440 

nXyr AlaValGluH isLeuValXyr AlaAlaSepPheSerProAsnLeuLeuAlaArqXv 

PstI SstI

XGCCXACXACCXGCAGXXXXGCCAGGGACAGAAGAGCXCXCXGACCCCGGXGCCGGAGAC 
1441 + + + + + -r 1500 

rAlaXyrXyrLeuGlnPheCysGlnGlyGlnLysSerSerLeuXhrProValProGluIh 

GGGCAGCXACGXGGCGGGGGCGGCCGCCAGXCCCAXGXGCXCGCXCXGCGAGGGCCGGGC 
1501 + + + + * 1560 

rGlySerXyrValAlaG lyAlaAl3AlaSer Pr oHe tCysSsr LeuCysGiuGlyAr oA 1 
CCCGGCCGXGXGCCXGAACACGCXCXXCXXXAGGCXGAGGGACCGCXXCCCCCCCGXCAX 

1361 ► + + !• + -r 1620 

aProAlaVa lCysLeuAsriXhpLeuPhePheAr3Le«jAP3AspArgPhePr oProValrte 

GXCCACGCAGCGGAGGG ACCCCXAXGXGAXCXCGGGGGCCXCGGGC7CCXACAACGAGAC 

1621 + + + -r + + 1680 

t Ser Thr G 1 nA p oAr 3 As pProIyrVal IleSerG lyAlaSepGlySep Xy p AsnG iuXh 

GGACXXXXXGGGCAACXXXCXCAAGXXCAXCGAXA AGGAGGACGACGGGC AGCGGCCGGA 

1631 + + + -r 1740 

p AspPheLeuG lyA^nPheLeuAsnPhe IleAspLysGluAspAapGlyGl nApgPpoAs 

CGACGAGCCCCGCXACACCXACXGGCAGCXGAACCAGAACCXGCXGGAGCGGCXGXCXCG 
1741 + + + +- + + 1800 

pAspGluPr oArgXypXhr TypXp pGlnLeuAsriGlnAsnLeuLeuGluAr gLauSer Ar 

GCXGGGC AXAGACGCXGAAGGAAAGCXAGAGAAGGAGCCCC AXGGCCCGCGXGACXXTGX 

1901 + + * -r + 1360 

3LeuGly IlsAs pAlaGluGly LysLeijGluLysGluPpoHisGly ppo Ap oAs pPhe 
CAAGAXGTICAAGGACGXGGAXGCGGCGGXGGACGCCGArtGXGGXCCAGXXXAXGAACAG
136i + + + + + + 1'520 

lLy^rtetPheLysAspValAspAlaAlaValAspAlaGluVaiValGlnPheMetAsnSe 

CAXGGCCAAGAACAACATCACCTACAAGGACCTGGXCAAGAGCXGCX ACCACGXGAXGCA 

132L -f- + -r + + 1980 

rrfe tALai.vsAsnAsn II*?XhrXyr L y s A s pLeuVa 1 LysSerCys Xyr HisVa I Met 51 



SXACXCGXGCAACCCCXXXGCSCAGCCCGCCTGCCCCAXCXXCACCCAGCXGXXXXAeCG 



1981 +■ 



2040 



nXyr SerCysAsnProPheAlaGlnPr oAlaCysPra IlePheXhr GlnLeuPheXyr Ar 

Pst I 

CXC ACXGCXG ACC AXCCTGCAGG AC AXCXCCCXGCCCAXCXGXAXGXGCX AX GAG A ATG A 

2041 -r -i -i- -r 4* 4- 2100 

aSerLeuLeuIhr HeLeuGInAs p IleSer LeuPro II eCysrtetCysXyr GluAsnAs 
CA ACCCCGGGCXXGGCCAGAGCCCCCCAGAGXGGCXAAAGGGXC ACTACCAGACGCXGXG 

2101 + -r -r + + + 2160 

pAsnPr oGlyLeuGlyGlnSerProProGluXr pLeuLysGlyHisXyrGlnXhr LeuCy 

CACCAACXXXAGGAGCC7GGCCAXCGACAAGGGGGXCCXCACGGCCAAGGAG6CCAAGGX 

21&1 *• j- + + + 2220 

sXhrAsnPheA^SerLeuAlalleAspLysGlyValLeuXhr AiaLysGiuAlaLysVa 

PstI *

GGXGCAXGGGGAGCCCACCTGCGACCXGCCAGACCXGGACGCGGCCCXGCAGGGCCGGGX 

2221 + + + + + + 2230 

IValHisGlyGluProXhrCysAspLeuProAspLeuAspAiaAlaLeuGlnQlyA^Va 

GXACGGCCGGCGGCXGCCXGXGCGCAXGXCCAAGGXGCXGAXGCXGXGCCCCAGGAACAI 

2281 -r -r + +> + 2340 

IXyrGlyArgAraLeuProV alArsMetSer Ly^ValLeuHetLeuCysPraArgAsn II 

C A AG AXCAAG AACAGGGXGGXCXXC ACGGGGGAG AATGCCGCCCXCCAGAACAGCXXCAX 

2341 + + + + + + 2400 

eLys IleLysAsnArsWiValPheXhrGlyGluAsnAlaAl aLeuGlnAsnSerPhe II 

CAAGXCCACXACCAGGhGGGAGAaCXACAXCAXCAACGGGCCCXACAXGAAAXXCCXCAA 

2401 +> h + * = -i + 2460 

eLy-sSer Xhr Xhr ArgAr 3G luAsnXyr He IleAsnGly ProXyr HetLysPheLeuAs 

CACCXACCACAAGACCCXAXXCCCGGACACXAAGCXCXCAAGCCXGXACCXGXGGCACAA 

2461 ^ + + -r ^ 2520 

nXhrXyrHisLysXhrLeuPheProAspXhr LysLeuSerSerLeuXyr LeuXr pHisAs 

CXXXXCCAGGCGGCGCXCGGXCCCXGXCCCCAGCGGGGCCAGCGCGGAGGAGXACXCXGA 

2521 + + + + + + 2530 

nPheSerAr3Ar3Ar3SerValProValPr oSerGlyAlaSerAlaGluGluXyrSer As 

CCXGGCCCXCXXXGXGGACGGGGGCXCCCGGGCCCACGAAGAGAGCAACGXC AX AGAXGX 

2581 + + + + + + 2640 

pLeuAl.aLeuPheV3lAspGlyGlySerAr3AlaHisGluGluSerAsnV.al IleAspVa 

GGXGCCXGGCAACCXGGXCACXXACGCCAAGCAGAGGCXCAACAACGCCAXCCXGAAGGC 

2641 + -1- + + *- + 2700 

lValProGlyAsnLeuValXhrXyr AlaLysGlnAr sLeuAsnAsnAla IleLeuLysAl 

GXGCGGCCAGACCCAGXXCXACAXCAGCCXGAXXCAGGG ACXGGXGCCGAGGACGCAGXC 

2701 - — -— ——4.---—— r- h — -< — + 2760 

aCysGlyG inlhr GlnPheTyr IleSer Leu IleGlnGlyLauValPr oAr 3Xhr GlnSe 
GGTGCCCGCCCGTGACTACCCCCACGXACTGGGCACGCijG'oCoG il3G^GIC:3GCAaC53i3C



2761 — — — — — ——4. — — T — 2S20 

CTACGCGGAGGCCACCTCCTCCCIIACTGCGACCACGGTGGTCTGCGCGGCCACAGACTG 

2S21 -r + -r -r + -r 2880 

aTyr AiauluAlaThrSer 3*r LeuThr Al aThrThrValValCysAl aAiaXhr AspCy 

XCXXAGCCAGGXCXGCAAGGCCCG7CCGGXTGXCACGCTGCCAGXGACCAXCAACAAGXA 
2881 + + + -5. 2940 

suauSerGinVaiCysLysAlaArgProValValThr LeuProVallhr IieAsnLysXy 

CACGGG^3GXCAACGGCAACAACCAGAXAXTCCAGGCCGGGAACCXGGGAXACXIXAXGGG 
2«41 + * 3000 

rXhrGlyValAsnGlyAsnAsnGln II ePheGlnA laGl vAsnLeuGly "vr PheMetGl 

CCGGGGCGXGG AC AGG AACCXGCXGCAGGCCCCCGGGGCTGGGCXGCGCA AGC AGGCCGG 
3001 +• + •** + + + 3060 

yArgGlyValAspArgAsnLeuLeuGlnAIaProGlyAIaGlyLeuArgLysGlnAlaGl 

GGGCXCXICCAXGCGGAAGAAGTXXGXCXXXGCCACCCCCACCCXAGGGXTGACCGXGAA 
3061 -i + +• -i- + r 3120 

yGlySer Ser Me tArgLysLysPheVa IPheA laXhr ?r oXhrLsuGlyLauXhr ValLy 

GCGCCGGACCCAAGCCGCGACCACAXAXGAGAXXGAGAACAXCAGGGCXGGCCXGGAGGC 
3121 -r + + -r ■ + 3130 

sArgArgXhrGlnAiaAIaXhrXhr XyrGluIleGluAsnlleArgAlaGlyLeuGluAl 

CAXXAXAXCACAAAAACAGGAGGAAGACXGXGXGXXXGAXGXGGXGXGCAACCXXGXGGA 
3131 + * -r H- -r 3240 

a lie IleSerGlriLysGlriGluGluAspCysValPheAspValValCysAsriLauValAs 

XGCCATGGGCGAGGCAXGCGCCXCGCXGACXAGGGACGACGCGGAGXACXXAXXGGGCCG 
3241 -r ■+* -r + + 3300 

pA 1 aMetGlyG luA laCys AlaSer Leulhr ArgAs pAs pAlaG luXyr LeuLeuGlyAr 

CXXCXCCGXCCXGGCGGACAGCGXCCXAGAAACCCXGGCGACCAXXGCCXCCAGCGGGA7 

3301 + + + + ■+ 3360 

gPheSer ValLeuAlaAspSer ValLeuGluXhr LeuAlaXhr IiaAlaS^rSerGly II 

AG AGXGG ACGGCGG AGGCCGCXC6GG ACXXXCXGG AGGG AGXGXGGGGXGGGCCCGGGGC 
3361 + - * * + 3420 

aG 1 u X r p Xhr A 1 aG 1 u A 1 a A 1 aAr g A «s p Pfte L euG 1 uG 1 y V a I Xr pG 1 y G 1 y Pr o G I y A 1 

AGCCCAGGACAACXXXAXCAG CGXGGCCGAGCCGGXC AGC ACCGCGXCGCAGGCCXCGGC 

3421 + + + -r -i- 3480 

aAlaGlnA-spAsnPhe 1 1 eSer Va 1 A 1 aGl uPr oVa i Ser Xhr A laSer G 1 n A laSer A 1 

* 

CGGGCXGCXGCXGGGXGGAGG AGGGCAGGGCXCCGGGGGCAGACGCAAGCGCCGXCXGGC 
3481 + +• + + + 3540 

aGly LeuLeuueuGlyGlyGlyGlyGlnG lySerGlyG lyArgAr ^Ly^AroAr gLsuA 1 

XhoI

CACCGXXCXCCCCGGACTCGAGGXCXAGAGACCCCTGGGGCGGCGAXGXCGGGGCTGCXG 

3541 +- + + + + + 3600 

aThrValLeuProGlyLeuGluValEnd

GCGGCGGCGXACAGCCAGGXGXACGCCCXGGCGGXXGAGCXGAGCGIGXGCACCCGGCXG 
3601 +■ + + -t- + 3660 

GACCCCCGGAGXCXGGACGXGGCXGCGGXGGXGCGCAACGCCGGCCXGCXGGCCGAGCXG 
3661 -* * + + + * 3720 
GAuGCCrttCCTCCTTCCCCGTTTGAGACGGCAGHATGrtCCGTGCAXGCAGCGuCCXGTCC
3721 -i — _^ — — — + 

XhoI

CIGGAGCTGGTGCACC"GCXAGAGAACTCGAGmGAGGCCTCXGCCGCGCTGCTCGCCCC" 
3731 — — — ——— ——— —————— — -^————— — — — — 4.— ——— — — — — — .^ — ———— ———— —— — — ——— — 4. 

GGXAGAAAGGG 
3341 •§■- 33S1 



3340 
Figure 4:

Restriction map of the plasmids pUC635 and pUC6130.
Figure 5:

Expression of the p138 fusion protein encoded by pUC635,
pUC924, pMF924, and pKK378.

1 2 3 4 5 
Figure 6:

Restriction map of the plasmid pUC924.
Figure 7:

Restriction map of the plasmid pMF924.
Figure 8:

Restriction map of the plasmid pKK378.
Figure 9:

Secondary structures of p138.
Figure 10:

Expression products of bacteria transformed with
the pUR plasmids carrying PstI fragments of p138.
Figure 11:

Expression products of bacteria transformed with the
pUC subclones carrying PstI fragments of p138.
Figure 12:

Construction scheme for pUCARG1140 encoding both
antigenic sites found by expression as β-gal fusion proteins
+ oligo-arg linker




b) 



PstI                                  HindIII

5'      G CGT CGT CGT CGT CGT TGA TA          3'
3' AC GTC GCA GCA GCA GCA GCA ACT ATT CGA     5'

        Arg Arg Arg Arg Arg stop stop
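
The reading frame of the oligo-arg linker shown above can be checked mechanically. The following Python sketch is illustrative only and is not part of the original disclosure. The names `TOP`, `BOTTOM_3_TO_5`, `CODONS` and `translate` are hypothetical, the codon table lists only the three codons that occur in the linker, and it is assumed that the second stop codon (TAA) is completed by the base that pairs with the first base of the HindIII-compatible overhang once the linker has been ligated.

```python
# Illustrative check of the oligo-arginine linker (all names are hypothetical).

# Upper strand, written 5'->3' as in the figure.
TOP = "G" + "CGT" * 5 + "TGA" + "TA"
# Lower strand, written 3'->5' as in the figure.
BOTTOM_3_TO_5 = "ACGTC" + "GCA" * 5 + "ACT" + "ATT" + "CGA"

# Only the codons that occur in the linker are listed here.
CODONS = {"CGT": "Arg", "TGA": "stop", "TAA": "stop"}

# Watson-Crick pairing used to compare the two strands base by base.
PAIRS = str.maketrans("ACGT", "TGCA")


def translate(coding):
    """Translate a coding strand codon by codon, starting at its first base."""
    return [CODONS[coding[i:i + 3]] for i in range(0, len(coding) - 2, 3)]


# The lower strand protrudes by four bases on each side of the duplex.
left_overhang = BOTTOM_3_TO_5[:4]     # 3'-ACGT, compatible with a PstI end
right_overhang = BOTTOM_3_TO_5[-4:]   # TCGA-5', compatible with a HindIII end
duplex = BOTTOM_3_TO_5[4:-4]

# Every base of the upper strand should pair with the base printed below it.
is_complementary = TOP.translate(PAIRS) == duplex

# Assumption: after ligation, the A pairing with the first overhang base
# completes the second stop codon (TA + A -> TAA).
reading_frame = TOP[1:] + "A"
peptide = translate(reading_frame)

print(left_overhang, right_overhang, is_complementary)
print(" ".join(peptide))
```

Under these assumptions the sketch reports five arginine codons followed by two stop codons, in agreement with the amino acid line printed under the linker.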
Figure 13:

IPTG-induced expression of the plasmids pUC600, pUC601,
pUCARG601 and pUCARG1140 with pUC8 as a control.
Figure 14:

Distribution and reactivity of the IgG and IgA
antibodies of individual NPC sera against the two
epitopes detected in p138.
Figure 15:

ELISA test using the protein encoded by pUCARG1140 as antigen.
Figure 16:

Purification of proteins carrying oligo-arginine
peptides at their carboxy-terminus.

5' g cgt cgt cgt cgt cgt tga ta ...
Figure 17

GGATCCGAAAAACTGGTCTATGGCTCGTGTGTCGATGCGCTGAAACCAACGGCAACAAAT
1 + + + + + + 60

TACTTACCTTGTTGTTGTGTGATGGGTAAAAACACACATCACACACTTAGGCCATAGGGA
61 + + + + + + 120

TGCTCACCGTAGCCGCGGCTCCAATCGCTTGAAGAAGTGTTCTTAGATCTAGTGGAAACC
121 + + + + + + 180

TGCGGAGAATGGCTTCTCGCCCAGGGAGATCCGGCTGGGGTGGGAGCATGGGTCGTGCTG
181 + + + + + + 240

GAGCTGACCCACCGGCATCATGATCGACCCGCTTTCTCTTCGTACCCTTCTGGGCCGGCT
241 + + + + + + 300

CCAGGTGGGCATCTTCTGCTTCCTTTTCTGAGCTGCTATCTGATAACTCTATGAGGACAT
301 + + + + + + 360

TTTCCCAATCTCCCGCCGATACCTGTTCCTGCACAACCGAGGTAGATGGGACTTCTTCTT
361 + + + + + + 420

CCATGTTGTCATCCAGGGCCGGGGGACCCGGCCTGTCCTTGTCCATTTTGTCTGCAACAA
421 + + + + + + 480

AAGTGTGACTCACCAACACCGCACCCCCCTTGTACCTATTAAAGAGGATGCTGCCTAGAA
481 + + + + + + 540

ATCGGTGCCGAGACAATGGAGGCAGCCTTGCTTGTGTGTCAGTACACCATCCAGAGCCTG
541 + + + + + + 600

               MetGluAlaAlaLeuLeuValCysGlnTyrThrIleGlnSerLeu

EcoRI

ATCCATCTCACGGGTGAAGATCCTGGTTTTTTCAATGTTGAGATTCCGGAATTCCCATTT
601 + + + + + + 660

IleHisLeuThrGlyGluAspProGlyPhePheAsnValGluIleProGluPheProPhe

TACCCCACATGCAATGTTTGCACGGCAGATGTCAATGTAACTATCAATTTCGATGTCGGG
661 + + + + + + 720

TyrProThrCysAsnValCysThrAlaAspValAsnValThrIleAsnPheAspValGly

GGCAAAAAGCATCAACTTGATCTTGACTTTGGCCAGCTGACACCCCATACGAAGGCTGTC
721 + + + + + + 780

GlyLysLysHisGlnLeuAspLeuAspPheGlyGlnLeuThrProHisThrLysAlaVal

TACCAACCTCGAGGTGCATTTGGTGGCTCAGAAAATGCCACCAATCTCTTTCTACTGGAG
781 + + + + + + 840

TyrGlnProArgGlyAlaPheGlyGlySerGluAsnAlaThrAsnLeuPheLeuLeuGlu

HindIII

CTCCTTGGTGCAGGAGAATTGGCTCTAACTATGCGGTCTAAGAAGCTTCCAATTAACGTC
841 + + + + + + 900

LeuLeuGlyAlaGlyGluLeuAlaLeuThrMetArgSerLysLysLeuProIleAsnVal

ACCACCGGAGAGGAGCAACAAGTAAGCCTGGAATCTGTAGATGTCTACTTTCAAGATGTG
901 + + + + + + 960

ThrThrGlyGluGluGlnGlnValSerLeuGluSerValAspValTyrPheGlnAspVal

TTTGGAACCATGTGGTGCCACCATGCAGAAATGCAAAACCCCGTGTACCTGATACCAGAA
961 + + + + + + 1020

PheGlyThrMetTrpCysHisHisAlaGluMetGlnAsnProValTyrLeuIleProGlu

ACAGTGCCATACATAAAGTGGGATAACTGTAATTCTACCAATATAACGGCAGTAGTGAGG
1021 + + + + + + 1080

ThrValProTyrIleLysTrpAspAsnCysAsnSerThrAsnIleThrAlaValValArg

-5-"-555" G ^ TGTCACGCTACC ^^
1081 + + + + + + 1140

AlaGlnGlyLeuAspValThrLeuProLeuSerLeuProThrSerAlaGlnAspSerAsn

TTCAGCGTAAAAACAGAAATGCTCGGTAATGAGATAGATATTGAGTGTATTATGGAGGAT
1141 + + + + + + 1200

PheSerValLysThrGluMetLeuGlyAsnGluIleAspIleGluCysIleMetGluAsp

PstI

GGCGAAATTTCACAAGTTCTGCCCGGAGACAACAAATTTAACATCACCTGCAGTGGATAC
1201 + + + + + + 1260

GlyGluIleSerGlnValLeuProGlyAspAsnLysPheAsnIleThrCysSerGlyTyr

EcoRI

GAGAGCCATGTTCCCAGCGGCGGAATTCTCACATCAACGAGTCCCGTGGCCACCCCAATA
1261 + + + + + + 1320

GluSerHisValProSerGlyGlyIleLeuThrSerThrSerProValAlaThrProIle

CCTGGTACAGGGTATGCATACAGCCTGCGTCTGACACCACGTCCAGTGTCACGATTTCTT
1321 + + + + + + 1380

ProGlyThrGlyTyrAlaTyrSerLeuArgLeuThrProArgProValSerArgPheLeu

GGCAATAACAGTATCCTGTACGTGTTTTACTCTGGGAATGGACCGAAGGCGAGCGGGGGA
1381 + + + + + + 1440

GlyAsnAsnSerIleLeuTyrValPheTyrSerGlyAsnGlyProLysAlaSerGlyGly

GATTACTGCATTCAGTCCAACATTGTGTTCTCTGATGAGATTCCAGCTTCACAGGACATG
1441 + + + + + + 1500

AspTyrCysIleGlnSerAsnIleValPheSerAspGluIleProAlaSerGlnAspMet

CCGACAAACACCACAGACATCACATATGTGGGTGACAATGCTACCTATTCAGTGCCAATG
1501 + + + + + + 1560

ProThrAsnThrThrAspIleThrTyrValGlyAspAsnAlaThrTyrSerValProMet

GTCACTTCTGAGGACGCAAACTCGCCAAATGTTACAGTGACTGCCTTTTGGGCCTGGCCA
1561 + + + + + + 1620

ValThrSerGluAspAlaAsnSerProAsnValThrValThrAlaPheTrpAlaTrpPro

AACAACACTGAAACTGACTTTAAGTGCAAATGGACTCTCACCTCGGGGACACCTTCGGGT
1621 + + + + + + 1680

AsnAsnThrGluThrAspPheLysCysLysTrpThrLeuThrSerGlyThrProSerGly

TGTGAAAATATTTCTGGTGCATTTGCGAGCAATCGGACATTTGACATTACTGTCTCGGGT
1681 + + + + + + 1740

CysGluAsnIleSerGlyAlaPheAlaSerAsnArgThrPheAspIleThrValSerGly

CTTGGCACGGCCCCCAAGACACTCATTATCACACGAACGGCTACCAATGCCACCACAACA
1741 + + + + + + 1800

LeuGlyThrAlaProLysThrLeuIleIleThrArgThrAlaThrAsnAlaThrThrThr

ACCCACAAGGTTATATTCTCCAAGGCACCCGAGAGCACCACCACCTCCCCTACCTTGAAT
1801 + + + + + + 1860

ThrHisLysValIlePheSerLysAlaProGluSerThrThrThrSerProThrLeuAsn

ACAACTGGATTTGCTGATCCCAATACAACGACAGGTCTACCCAGCTCTACTCACGTGCCT
1861 + + + + + + 1920

ThrThrGlyPheAlaAspProAsnThrThrThrGlyLeuProSerSerThrHisValPro

ACCAACCTCACCGCACCTGCAAGCACAGGCCCCACTGTATCCACCGCGGATGTCACCAGC
1921 + + + + + + 1980

ThrAsnLeuThrAlaProAlaSerThrGlyProThrValSerThrAlaAspValThrSer

CCAACACCAGCCGGCACAACGTCAGGCGCATCACCGGTGACACCAAGTCCATCTCCATGG
1981 + + + + + + 2040

ProThrProAlaGlyThrThrSerGlyAlaSerProValThrProSerProSerProTrp

GACAACGGCACAGAAAGTAAGGCCCCCGACATGACCAGCTCCACCTCACCAGTGACTACC 

2041 * - * * 210O 

AspAsnGI yThrGl uSePLysA 1 QProAspMatThrSerSerThrSarProVa 1 ThrThr 

CCAACCCCAAATGCCACCAGCCCCACCCCAGCAGTGACTACCCCAACCCCAAATGCCACC 
2101 «. ^„ „^ 216Q 

ProThrProAsnA 1 aThrSerProThrProA 1 aVa 1 TMrTnrProThrProAanA I aThr 

AGCCCCACCCCAGCAGTGACTACCCCAACCCCAAATGCCACCAGCCCCACCTTGGGAAAA 

2161 + «- + ^ +. 222Q 

SerProThrProA 1 aVa I ThrThrProThrProAsnA 1 aThpSarProThrLauG 1 yLy s 

ACAAGTCCTACCTCAGCAGTGACTACCCCAACCCCAAATGCCACCAGCCCCACCTTGGGA 

2221 + ^ ♦ f 22BO 

ThPSapppoThPSapA 1 ava 1 ThpThpppoThrPpoAsnA I aThPSapPPoThPLauG I y 

AAAACAAGCCCCACCTCAGCAGTGACTACCCCAACCCCAAATGCCACCAGCCCCACCTTG 
22B1 ♦ - * * + 2340 

LysThrSapProThpSerA 1 aVa I ThrThrProThrProAsnA 1 aThrSerProThruau 

GGAAAAACAAGCCCCACCTCAGCAGTGACTACCCCAACCCCAAATGCCACCGGCCCTACT 

2341 - ♦ «. * ♦ 2400 

G 1 yi-ysThrSerProThrSerA 1 aVa 1 ThrThrProThrProAsnA I aThrGl yProThr 

GTGGGAGAAACAAGTCCACAGGCAAATGCCACCAACCACACCTTAGGAGGAACAAGTCCC 
2401 * * ^ + 2460 

va 1 G l yG 1 uThrSerProG 1 nA 1 aAsnA 1 aThrAsnrli sThrLeuG 1 yGl yThrSorPro 

ACCCCAGTAGTTACCAGCCAACCAAAAAATGCAACCAGTGCTGTTACCACAGGCCAACAT 

2461 * ~ + f ♦ ^ 2520 

ThrProValVa I ThpSerGlnPPOLysAsnA 1 aThrSarA 1 aval ThrThrGl yGl nH1 s 

AACATAACTTCAAGTTCAACCTCTTCCATGTCACTGAGACCCAGTTCAAACCCAGAGACA 
2521 * * + + + 2580 

Asnl 1 eThrSerSerSerThrSerSerMetSerLeuArgProSerSarAanProG 1 uThr 

CTCAGCCCCTCCACCAGTGACAATTCAACGTCACATATGCCTTTACTAACCTCCGCTCAC 

25B1 + + + 2640 

ueuSarProSerThrSerAspAsnSarThrSerMl sMetProLeuLeuThrSarA 1 aH1 s 

CCAACAGGTGGTGAAAATATAACACAGGTGACACCAGCCTCTATCAGCACACATCATGTG 
2641 * * + + * 2700 

ProThrGl yGl yGluAsnl 1 eThrG 1 nva 1 TnrProA 1 aSer 1 1 eSerThrrH sm 1 sVa 1 

TCCACCAGTTCGCCAGCACCCCGCCCAGGCACCACCAGCCAAGCGTCAGGCCCTGGAAAC 

2701 — + — - + +■ — — -»■ 27 60 

SerThrSerSerProA 1 aProArgProGl yThrThrSarG 1 nA 1 aSerGl yProGl yAsn 

AGTTCCACATCCACAAAACCGGGGGAGGTTAATGTCACCAAAGGCACGCCCCCCCAAAAT 
2761 ♦ + + ^ 2820 

SerSerThpSerThPLysPPoGl yGluVal Asnval ThrUysGl yThrProProG 1 nAsn 

GCAACGTCGCCCCAGGCCCCCAGTGGCCAAAAGACGGCGGTTCCCACGGTCACCTCAACA 

2821 ■* -r ♦ + * ^ 2880 

A 1 aThrSepPpoGl nA 1 aProSerGI yGlnLysThrA 1 aVa 1 ProThrVa 1 ThrSerThr 

GGTGGAAAGGCCAATTCTACCACCGGTGGAAAGCACACCACAGGACATGGAGCCCGGACA 
2881 * -r 1- * + ♦ 2940 

G l yG I yUysA 1 aAsnSerThrThrG 1 yG 1 yLysMI sThrThrGl yHi sG I yA I aArgThr 
AGTACAGAGCCCACCACAGATTACGGCGGTGATTCAACTACGCCAAGACCGAGATACAAT
2941 + + + + + + 3000

SerThrGluProThrThrAspTyrGlyGlyAspSerThrThrProArgProArgTyrAsn

GCGACCACCTATCTACCTCCCAGCACTTCTAGCAAACTGCGGCCCCGCTGGACTTTTACG
3001 + + + + + + 3060

AlaThrThrTyrLeuProProSerThrSerSerLysLeuArgProArgTrpThrPheThr

AGCCCACCGGTTACCACAGCCCAAGCCACCGTGCCAGTCCCGCCAACGTCCCAGCCCAGA
3061 + + + + + + 3120

SerProProValThrThrAlaGlnAlaThrValProValProProThrSerGlnProArg

PstI *********************

TTCTCAAACCTCTCCATGCTAGTACTGCAGTGGGCCTCTCTGGCTGTGCTGACCCTTCTG
3121 + + + + + + 3180

PheSerAsnLeuSerMetLeuValLeuGlnTrpAlaSerLeuAlaValLeuThrLeuLeu

CTGCTGCTGGTCATGGCGGACTGCGCCTTTAGGCGTAACTTGTCTACATCCCATACCTAC
3181 + + + + + + 3240

LeuLeuLeuValMetAlaAspCysAlaPheArgArgAsnLeuSerThrSerHisThrTyr

ACCACCCCACCATATGATGACGCCGAGACCTATGTATAAAGTCAATAAAAATTTATTAAT
3241 + + + + + + 3300

ThrThrProProTyrAspAspAlaGluThrTyrValEnd

CAGAAATTTGCACTTTCTTTGCTTCACGTCCCCGGGAGCGGGAGCGGGCACGTCGGGTGG
3301 + + + + + + 3360

CGTTGGGGTCGTTTGATTCTCGTGGTCGTGTTCCCTCACC
3361 + + + 3400
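
The restriction sites marked above individual rows of Figure 17 can be located by a plain string search. The following Python sketch is illustrative only and is not part of the original disclosure. The names `SITES`, `find_sites`, `ROW_601` and `ROW_841` are hypothetical. The two rows are copied from the listing (positions 601 to 660 and 841 to 900), and the recognition sequences are the standard ones for the enzymes named in the figures.

```python
# Illustrative restriction-site search (all names are hypothetical).

# Recognition sequences of the enzymes named in the figures.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
}

# Rows copied from Figure 17; the figure numbers bases from 1.
ROW_601 = "ATCCATCTCACGGGTGAAGATCCTGGTTTTTTCAATGTTGAGATTCCGGAATTCCCATTT"
ROW_841 = "CTCCTTGGTGCAGGAGAATTGGCTCTAACTATGCGGTCTAAGAAGCTTCCAATTAACGTC"


def find_sites(seq, first_position=1):
    """Return {enzyme: [coordinates]} for every site found in seq.

    first_position is the figure coordinate of the first base of seq,
    so the reported coordinates match the numbering printed in the figure.
    """
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        index = seq.find(site)
        while index != -1:
            positions.append(first_position + index)
            index = seq.find(site, index + 1)
        if positions:
            hits[enzyme] = positions
    return hits


print(find_sites(ROW_601, first_position=601))
print(find_sites(ROW_841, first_position=841))
```

The search only reports where a recognition sequence begins. It makes no statement about cleavage positions within the site or about sites that span two adjacent rows; joining the rows before searching would cover the latter case.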
Figure 19

ACCATGATTACGGATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGC
1 + + + + + + 60

   MetIleThrAspSerLeuAlaValValLeuGlnArgArgAspTrpGluAsnProGly

GTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAA 

61 * ♦ * * * * 120 

Va 1 ThrG 1 nLeuAsnArgLeuA 1 aA ! aHlsProProPh«A J aStrTrpArgAsnSerG 1 u 

GAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCGCTTT 

121 * — * * * 180 

Gl uA 1 aArgThrAapArgProSerGlnGlnLeuArgSorteuAsnGlyGluTrpArgPhe 

GCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCTGAG 

181 + + + * * 240 

A 1 aTrpPheProA t aProGluA 1 aVa t ProG 1 uSarTrpLeuGl uCysAapLauProG 1 u 

GCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCCCATCTAC 

241 + + + * •+ + 300 

A 1 aAspThrValVa 1 va 1 ProSarAsnTrpGl nMetHl sG 1 yTyrAspA 1 aProI 1 aTyr 

ACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTTTGTTCCCACGGAGAATCCGACG 

301 * f * + ♦ ♦ 360 

ThrAsnVa 1 ThrTy rProI 1 eThrVa 1 AsnProProPheva 1 ProThrG 1 uAsnProThr 

GGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGAAGGCCAGACGCGA 

361 * * + - + -- + 420 

G 1 yCysTy rSerueuThrPheAsnVa 1 AspG I uSarTppLeuGlnGluG 1 yG 1 nThrArg 

ATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGCGCTGGGTCGGT 

421 ♦ * ♦ ♦ * ■** 480 

Hell ePheAspGI yVa I AsnSarA 1 aPhaH i sLeuTrpCysAsnG 1 yArgTrpVa I Gl y 

TACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACGCGCCGGA 

481 ♦ * * * + 540 

TyrGl yG 1 nAspSerArgLauProSarGI uPhaAspLeuSerA 1 aPheLeuArgA 1 aGl y 

GAAAACCGCCTCGCGGTGATGGTGCTGCGTTGGAGTGACGGCAGTTATCTGGAAGATCAG 

541 ♦ + + * 600 

Gl uAsnArgLauA I aVa lMetVal LeuArgTrpSarAspGI ySarTyrLeuG I uAspGl n 

GATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACCGACTACA 

601 + * * ♦ *" 660 

AsDMatTrpArgMetSerGl yl 1 aPhaArgAapVal SerLeuLouHi sLysProThrThr 

CAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATGATTTCAGCCGCGCTGTACTG 

661 * * — — — 720 

G 1 nl 1 aSarAspPheHlsVa 1 A 1 aThrArgPhaAsnAspAspPheSerArgA 1 ava 1 Leu 

GACGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAACAGTTTCTTTA 

72i -»■ 1- + ♦ * 780 

GluAl aGl uVa 1 Gl nMetCysGl yG 1 uLeuArgAspTyrLeuArgVal ThrVa 1 SerLeu 

TGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCGCGCCTTTCGGCGGTGAAATTATCGAT 

781 ♦ * * ♦ ^ 840 

TrpGlnGl yGluThrGlnVa 1 A 1 aSerGl yThrAI aProPheGI yGl yGluI 1 el 1 eAsp 

GAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACJCCGAAACTG 

641 1- * * * * *** 900 

G 1 uArgGl yGl yTyrA 1 aAspArgVa 1 ThrLeuArgLeuAsnVa I Gl uAsnProLyst.au 

TGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGCCGACGGC 
901 + + + + + + 960

TrpSerA ] aGluI 1 aProAsnLauTyrArgA laval ValGl uLeuMlsThrAI aAspGl y 

ACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGCGAGGTGCCGATTGAAAATGGT 

961 + + 1- * 1020 

Thri_eul leGluAlaGluAl aCysAspva I G 1 yPneArgG 1 uVa 1 ArgI 1 aGl uAsnGl y 

CTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGAGCATCAT 

l02 i * - ♦ * + 1080 

LeuLeuLeuLeuAsnGlyLysProLeuLeuI 1 eArgGl yVa 1 AsnArgHl sGl urti sMi s 

CCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATATCCTGCTGATGAAG 

1081 * * ♦ + * * 11A0 

ProLeuHl sGl yGlnVa 1 Mat AspGl uGl nThrMetVa IGlnAsp1 1 eLeuLeuMttLys 

CAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCGAACCATCCGCTGTGGTACACG 

+ + +. +■ ♦ ■»- 1200 

GlnAsnAsnPhaAsnAUVa lArgCysSarHi sTyrProAanHI sProLauTrpTy rThr 

CTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCACGGCATG 

1201 * *" * * * 1260 

LeuCysAsDArgTy rGl yLeuTy rVa 1 Va 1 AspGl uA 1 aAsnl I aGl uThrHl sG 1 yMe t 

GTGCCAATGAATCGTCTGACCGATGATCCGCGCTGGCTACCGGCGATGAGCGAACGCGTA 

1261 ♦ - * * * * 1320 

Va l ProMatAsnArgueuThrAsoAspProArgTrpUauProA 1 aMatSarG I uArgVa 1 

ACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCTGGGGAAT 

1321 f + •*• ♦ 1360 

ThrArgMs tva I G1 nArgAspArgAsnHI sProSerVal I 1 el 1 aTrpSarLauGlyAsn 

GAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGCTGGATCAAATCTGTCGATCCT 

1381 - * * 1**0 

GluSarGl yHisGlyAl aAsnHI sAspA I aLeuTyrArgTrpI 1 aUysSerVal AspPro 

TCCCGCCCGGTGCAGTATGAAGGCGGCGGAGCCGACACCACGGCCACCGATATTATTTGC 

♦ - ♦ + 1500 

SarArgProVa I G t nTyrG 1 uGl yG1 yGI yA 1 aAspThrThrA 1 aThrAspI lei 1 eCys 

CCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGGCTGTGCCGAAATGGTCCATC 

1501 ♦ * + * * * 1560 

ProMetTyrA \ aArgva t AapG 1 uAspGl nProPheProA 1 ava 1 ProLyaTrpSarl 1 e 

AAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCCTTTGCGAATACGCCCAC 

1S61 * * - 1620 

LysLysTrpLeuSerLauProG I yGluThrArgProLeuI 1 eLeuCysG I uTyrA IaH1s 

GCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGCAGGCGTTTCGTCAGTATCCC . 

1621 <- * *■ * * 1690 

A 1 aMetGl yAsnSerLauGI yGI yPheA 1 BLysTyrTrpGlnA 1 aPheArgGlnTyrPro 

CGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATATGATGAA 

1681 * * * * I 7 * 0 

ArgLeuGl nGl yGI yPheVa 1 TrpAspTppva 1 AspGl nSerLeuI 1 eLysTyrAspG 1 u 

AACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGCCGAACGATCGCCAG 

l7A1 * ~ » 1- ♦ ♦ 1800 

AsnG 1 yAsnProTrp5arA I aTyrGl yG 1 yAspPheGl yAspThrProAanAspArgG 1 n 

TTCTGTATGAACGGTCTGGTCTTTGCCGACCGCACGCCGCATCCAGCGCTGACGGAAGCA 

1801 * + * * + + I 860 

PheCysMatAsnGI yLauVa 1 PhaA I aAapArgThrProH 1 sProA 1 aLauThrG I uA 1 a 
AAACACCAGCAGCAGTTTTTCCAGTTCCGTTTATCCGGGCAAACCATCGAAGTGACCAGC

1861 •*• ♦ * ■»- 1920 

LysHi sGlnGlnGl nPhoPhaGl nPhaArgLeuSarGI yGI nThrl 1 eGl uValThrSer 

GAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACTGGATGGTGGCGCTGGATGGT 

1921 + + + ♦ + 1980 

GluTyrLauPhaArgMi sSerAspAsnG 1 uL«uLeuH1 sTrpMa t Va 1 A 1 aLsuAspGI y 

AAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAAACAGTTGATT 

1981 * * f + 2040 

LysProLauAlaSerGlyGluVal ProLeuAspVa 1 A 1 aProG I nG 1 yLysGlnLaul 1 e 

GAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGCTCACAGTACGCGTA 

2041 + + «- 1- +■ — ♦ 2100 

G I uLeuProG 1 uLeuProG \ nProG 1 uSerA 1 aG 1 yG 1 nLauTrpLauThrVa 1 ArgVa I 

GTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCAGCAGTGG 

2101 + * ♦ * * * 2160 

ValGlnProAsnA 1 aTnrA 1 aTrpSerGI uA 1 aGl yM1 si 1 eSorA 1 aTrpGlnGlnTrp 

CGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCCGCATCTG 

2161 + + + 1- ♦ -k 2220 

ArgLeuA 1 aG 1 uAsnLeuSe rVa 1 ThrLeuProA 1 aAlaSerHisAl al 1 aProHl sLeu 

ACCACCAGCGAAATGGATTTTTGCATCGAGCTGGGTAATAAGCGTTGGCAATTTAACCGC 
2221 * ■*■ + * ♦ ♦ 22BO 

ThrThrSerGluMatAspPheCyal 1 oGl uLeuGl yAsnLysArgTrpGI nPheAsnArg 

CAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAACAACTGCTGACGCCGCTG 

2281 «• 1- + * 2340 

GlnSerG 1 yPheLeuSerGlnMetTrpI 1 aGl yAspLysLysG 1 nLeuLeuThrProLeu 

CGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATTGGCGTAAGTGAAGCGACCCGC 

2341 + * + + 2400 

ArgAspG 1 nPheThrArgA I aProLeuAspAsnAapI 1 aGl yVal SerGluAlaThrArg 

ATTGACCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATTACCAGGCCGAAGCA 
2401 -« ♦ + + + 2460 

1 1 eAspProAsnA 1 aTrpVa 1 G 1 uArgTrpLysA 1 aA 1 aG 1 yH 1 sTy rG 1 nA 1 aG 1 uA t a 

GCGTTGTTGCAGTGCACGGCAGATACACTTGCTGATGCGGTGCTGATTACGACCGCTCAC 

2461 + + - + + + 2520 

A 1 aLeuLeuG 1 nCysThrA 1 aAapThrLeuA 1 aAspA 1 aVa 1 Laul 1 aThrThrA 1 aH1 s 

GCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGGAAAACCTACCGGATTGATGGT 

2521 f ♦ * + + 1- 2580 

A I aTrpG 1 nrHaGlnGl yLysThrLeuPhel t eSerArgLysThrTyrArgl 1 eAspGly 

AGTGGTCAAATGGCGATTACCGTTGATGTTGAAGTGGCGAGCGATACACCGCATCCGGCG 
2581 + ♦ + ^ 2640 

SerG 1 yG 1 nMe t A 1 al 1 eThrVa 1 AspVa I G 1 uVa 1 A 1 aSarAapThrProHl aProA 1 a 

CGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCTCGGATTA 

2641 + *■ + ■*■ + * 2700 

ArgI I eGlyL.euAsnCysGlni.euA 1 aGl nVa 1 A 1 aGl uArgVa 1 AsnTrpLeuG 1 yLeu 

GGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGCCTGTTTTGACCGCTGGGATCTG 

2701 - * ♦ + + 1- 2760 

Gl yProG 1 nGl uAsnTy r Pro AspArgLeuThrA 1 aA 1 aCysPneAspA rgTrpAspL.au 

CCATTGTCAGACATGTATACCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCGCTGCGGG 
2761 ■* * > +■ <► + 2820 

ProLeuSerAspMatTyrThrProTyrVa I PhaProSerGI uAsnGl yUeuArgCysGI y 
ACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGCGACTTCCAGTTCAACATCAGC
2821 + + + + + + 2880

ThrArgGluLeuAsnTyrGlyProHisGlnTrpArgGlyAspPheGlnPheAsnIleSer

CGCTACAGTCAACAGCAACTGATGGAAACCAGCCATCGCCATCTGCTGCACGCGGAAGAA
2881 + + + + + + 2940

ArgTyrSerGlnGlnGlnLeuMetGluThrSerHisArgHisLeuLeuHisAlaGluGlu

GGCACATGGCTGAATATCGACGGTTTCCATATGGGGATTGGTGGCGACGACTCCTGGAGC
2941 + + + + + + 3000

GlyThrTrpLeuAsnIleAspGlyPheHisMetGlyIleGlyGlyAspAspSerTrpSer

CCGTCAGTATCGGCGGAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTTGGTCTGG
3001 + + + + + + 3060

ProSerValSerAlaGluPheGlnLeuSerAlaGlyArgTyrHisTyrGlnLeuValTrp

TGTCAAAAAggggatccgtcgacctgcaGTGGATACGAGAGCCATGTTCCCAGCGGCGGA
3061 + + + + + + 3120

CysGlnLysGlyAspProSerThrCysSerGlyTyrGluSerHisValProSerGlyGly

ATTCTCACATCAACGAGTCCCGTGGCCACCCCAATACCTGGTACAGGGTATGCATACAGC
3121 + + + + + + 3180

IleLeuThrSerThrSerProValAlaThrProIleProGlyThrGlyTyrAlaTyrSer

CTGCGTCTGACACCACGTCCAGTGTCACGATTTCTTGGCAATAACAGTATCCTGTACGTG 

3181 * * * * — * * 3240 

uauArguauTrirProArgProva 1 SarArgPhaLauGl yAsnAanSar I 1 aLauTy rVa 1 

TTTTACTCTGGGAATGGACCGAAGGCGAGCGGGGGAGATTACTGCATTCAGTCCAACATT 

32A1 * * + * * + 3300 

PhtTyrSarGt yAsnGlyProLysA 1 aSarGl yGl yAapTyrCyal 1 aGI nSafAanl 1 a 

GTGTTCTCTGATGAGATTCCAGCTTCACAGGACATGCCGACAAACACCACAGACATCACA 

3301 + * * * * * 3360 

va 1 PhaSarAapGI ul 1 aProAl aSarG 1 nAapMat ProThrAanThrThrAapI 1 aThr 

TATGTGGGTGACAATGCTACCTATTCAGTGCCAATGGTCACTTCTGAGGACGCAAACTCG 
23gx — — —— — — — — -♦. — — ——— — — — —♦■ — -——- —-——+————— _———.♦.—————— ——*.———————— — ♦ 3420 

TyrVft 1 G 1 yAspAtnA 1 aTMrTy r SarVa l ProMttvt l ThrSarGl uAapA 1 aAanSar 

CCAAATGTTACAGTGACTGCCTTTTGGGCCTGGCCAAACAACACTGAAACTGACTTTAAG 

3421 «. ♦ * * ♦ * 3480 

ProAanva 1 TMrVa 1 TrtrA 1 aPnaTrpA 1 »TrpProA«nAsnThpGl uTHrAapPMaUys 

TGCAAATGGACTCTCACCTCGGGGACACCTTCGGGTTGTGAAAATATTTCTGGTGCATTT 

34 81 * + ♦ ♦ ♦ + 3540 

CySLysTppThrLtuThrStrGI yThrProSarGl yCysGl uAanl 1 aSarGl yA 1 aPfta 

GCGAGCAATCGGACATTTGACATTACTGTCTCGGGTCTTGGCACGGCCCCCAAGACACTC 

3541 * * * * 3600 

A 1 aSarAanArgTnrPMaAapI 1 aTrirVa 1 SarG 1 y LauGI yTftrA 1 aProLyaTnrUau 

ATTATCACACGAACGGCTACCAATGCCACCACAACAACCCACAAGGTTATATTCTCCAAG 

3601 ---------- — -».--- + ♦ — — ♦ 3660 

1 1 al 1 aThrArgThrA 1 aThpAsnA 1 aThrTnrThrThPMl iLysVt 1 1 1 aPMaSarLya 

GCACCCGAGAGCACCACCACCTCCCCTACCTTGAATACAACTGGATTTGCTGATCCCAAT 

33Q1 - — — — -♦- — — — — — 37 20 

A I aProG l uSarThrThrTrirSarProThruauAanThrTrirG 1 yPtiaA 1 aAapProAan 

ACAACGACAGGTCTACCCAGCTCTACTCACGTGCCTACCAACCTCACCGCACCTGCAAGC 
3721 * ♦ * * * * 3780 
ThrThrThrGlyLeuProSerSerThrHisValProThrAsnLeuThrAlaProAlaSer



ACAGGCCCCACTGTATCCACCGCGGATGTCACCAGCCCAACACCAGCCGGCACAACGTCA 

3781 •*■ + ♦ — f * * 3840 

ThrG 1 yProThrVa 1 SerThrA I aAspVa 1 ThpSerProThrProA 1 aG 1 y TnrThrSer 

GGCGCATCACCGGTGACACCAAGTCCATCTCCATGGGACAACGGCACAGAAAGTAAGGCC 
3841 * -f + * ♦ ^ 3900 

GI yA I BS«rProVa 1 ThrProSarProSarPpoTrpAspAsnGl yThrGI uSsrUysA 1 a 

CCCGACATGACCAGCTCCACCTCACCAGTGACTACCCCAACCCCAAATGCCACCAGCCCC 
3901 + + + + + + 3960

ProAspMetThrSerSerThrSerProValThrThrProThrProAsnAlaThrSerPro

ACCCCAGCAGTGACTACCCCAACCCCAAATGCCACCAGCCCCACCCCAGCAGTGACTACC 

3961 + + + + + + 4020

ThrProAlaValThrThrProThrProAsnAlaThrSerProThrProAlaValThrThr

CCAACCCCAAATGCCACCAGCCCCACCTTGGGAAAAACAAGTCCTACCTCAGCAGTGACT 

4021 + + + + + + 4080

ProThrProAsnAlaThrSerProThrLeuGlyLysThrSerProThrSerAlaValThr

ACCCCAACCCCAAATGCCACCAGCCCCACCTTGGGAAAAACAAGCCCCACCTCAGCAGTG 

4081 + + + + + + 4140

ThrProThrProAsnAlaThrSerProThrLeuGlyLysThrSerProThrSerAlaVal

ACTACCCCAACCCCAAATGCCACCAGCCCCACCTTGGGAAAAACAAGCCCCACCTCAGCA 

4141 + + + + + + 4200

ThrThrProThrProAsnAlaThrSerProThrLeuGlyLysThrSerProThrSerAla

GTGACTACCCCAACCCCAAATGCCACCGGCCCTACTGTGGGAGAAACAAGTCCACAGGCA

4201 + + + + + + 4260

ValThrThrProThrProAsnAlaThrGlyProThrValGlyGluThrSerProGlnAla

AATGCCACCAACCACACCTTAGGAGGAACAAGTCCCACCCCAGTAGTTACCAGCCAACCA 

4261 + + + + + + 4320

AsnAlaThrAsnHisThrLeuGlyGlyThrSerProThrProValValThrSerGlnPro

AAAAATGCAACCAGTGCTGTTACCACAGGCCAACATAACATAACTTCAAGTTCAACCTCT 

4321 + + + + + + 4380

LysAsnAlaThrSerAlaValThrThrGlyGlnHisAsnIleThrSerSerSerThrSer

TCCATGTCACTGAGACCCAGTTCAAACCCAGAGACACTCAGCCCCTCCACCAGTGACAAT 
4381 + + + + + + 4440

SerMetSerLeuArgProSerSerAsnProGluThrLeuSerProSerThrSerAspAsn

TCAACGTCACATATGCCTTTACTAACCTCCGCTCACCCAACAGGTGGTGAAAATATAACA 

4441 + + + + + + 4500

SerThrSerHisMetProLeuLeuThrSerAlaHisProThrGlyGlyGluAsnIleThr

CAGGTGACACCAGCCTCTATCAGCACACATCATGTGTCCACCAGTTCGCCAGCACCCCGC 

4501 + + + + + + 4560

GlnValThrProAlaSerIleSerThrHisHisValSerThrSerSerProAlaProArg

CCAGGCACCACCAGCCAAGCGTCAGGCCCTGGAAACAGTTCCACATCCACAAAACCGGGG 

4561 + + + + + + 4620

ProGlyThrThrSerGlnAlaSerGlyProGlyAsnSerSerThrSerThrLysProGly

GAGGTTAATGTCACCAAAGGCACGCCCCCCCAAAATGCAACGTCGCCCCAGGCCCCCAGT 

4621 + + + + + + 4680

GluValAsnValThrLysGlyThrProProGlnAsnAlaThrSerProGlnAlaProSer



GGCCAAAAGACGGCGGTTCCCACGGTCACCTCAACAGGTGGAAAGGCCAATTCTACCACC

4681 + + + + + + 4740

GlyGlnLysThrAlaValProThrValThrSerThrGlyGlyLysAlaAsnSerThrThr

GGTGGAAAGCACACCACAGGACATGGAGCCCGGACAAGTACAGAGCCCACCACAGATTAC

4741 + + + + + + 4800

GlyGlyLysHisThrThrGlyHisGlyAlaArgThrSerThrGluProThrThrAspTyr

GGCGGTGATTCAACTACGCCAAGACCGAGATACAATGCGACCACCTATCTACCTCCCAGC

4801 + + + + + + 4860

GlyGlyAspSerThrThrProArgProArgTyrAsnAlaThrThrTyrLeuProProSer

ACTTCTAGCAAACTGCGGCCCCGCTGGACTTTTACGAGCCCACCGGTTACCACAGCCCAA

4861 + + + + + + 4920

ThrSerSerLysLeuArgProArgTrpThrPheThrSerProProValThrThrAlaGln

GCCACCGTGCCAGTCCCGCCAACGTCCCAGCCCAGATTCTCAAACCTCTCCATGCTAGTA

4921 + + + + + + 4980

AlaThrValProValProProThrSerGlnProArgPheSerAsnLeuSerMetLeuVal

CTGCAGccaagcttATCGATGATAAGCTGTCAAACATGA

4981 + + + 5019

LeuGlnProSerLeuSerMetIleSerCysGlnThrEnd

Figure 22
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Figure 23 




Figure 24
Figure 25:

Expression of gp350 fragments as β-gal fusion proteins
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Figure 26: 

Expression of proteins 
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Figure 28 - 1 - 

A: p47 



GTACGTTGGCTTCTGCTGCTGCTTGTGATCATGGAAACCACTCAGACTCTCCGCTTTAAG

79869 + + + + + + 79928

CATGCAACCGAAGACGACGACGAACACTAGTACCTTTGGTGAGTCTGAGAGGCGAAATTC

METTQTLRFK

ACCAAGGCCCTAGCCGTCCTGTCCAAGTGCTATGACCATGCCCAGACTCATCTCAAGGGA

79929 + + + + + + 79988

TGGTTCCGGGATCGGCAGGACAGGTTCACGATACTGGTACGGGTCTGAGTAGAGTTCCCT

TKALAVLSKCYDHAQTHLKG

GGAGTGCTGCAGGTAAACCTTCTGTCTGTAAACTATGGAGGCCCCCGGCTGGCCGCCGTG

79989 + + + + + + 80048

CCTCACGACGTCCATTTGGAAGACAGACATTTGATACCTCCGGGGGCCGACCGGCGGCAC

GVLQVNLLSVNYGGPRLAAV



GCCAACGCAGGCACGGCCGGGCTAATCAGCTTCGAGGTCTCCCCTGACGCTGTGGCCGAG

80049 + + + + + + 80108

CGGTTGCGTCCGTGCCGGCCCGATTAGTCGAAGCTCCAGAGGGGACTGCGACACCGGCTC

ANAGTAGLISFEVSPDAVAE
TGGCAGAATCACCAGAGCCCAGAGGAGGCCCCGGCCGCCGTGTCATTTAGAAACCTTGCC

80109 + + + + + + 80168

ACCGTCTTAGTGGTCTCGGGTCTCCTCCGGGGCCGGCGGCACAGTAAATCTTTGGAACGG

WQNHQSPEEAPAAVSFRNLA



TACGGGCGCACCTGTGTCCTGGGCAAGGAGCTGTTTGGCTCGGCTGTGGAGCAGGCTTCC

80169 + + + + + + 80228

ATGCCCGCGTGGACACAGGACCCGTTCCTCGACAAACCGAGCCGACACCTCGTCCGAAGG

YGRTCVLGKELFGSAVEQAS



CXGC A AXX7X ACA AGCGGCCACAAGGGGGXXCCCGGCCXGAAXXXGXX AAGCXCACXATG 

30229 — -r — — — — — — — ^ T 

GACGXXAAAAXGXTCGCCGGXGXXCCCCCAAGGGCCGGACXXAAACAAXXCGAGXGAXAC 



L G E Y 



KRpQGGSRPEEVKuXh 




EYDDKVSKSHHXCALMPYrtP Vlv 
CCG GCC AG CG AC AGGC XG AGG A ACG AGC A GAXGAXXGGGC AG GXGCXGXXGAXGCCC A AG 
b0 QGCCGGXCGCXGXCCGACXCCXXGCXCGXCXACXArtCCCGXCCACGACAACTACGGGIXC 
f.A3DRLftMEQMIGQVLLhPK 
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TASSLQKUARQQGSGGVKVX 

CXCAAXCC3GAXCXCXACGXCACCACGXAXACXXCXGGGGAGGCCXGCCXCACCCXAGAC 
3046? - + T * SOZ^S 

gagxxaggccxagagaxgcagxggxgcaxaigaagaccccxccggacggagxgggaxcxg 
lhpdlyvxxyxsgeac1.xld 
xacaagccxcxga^xgxggggccai acgaggccxxcacxggcccxgxggccaaggcxcag 

+ - — — — + "•" - — — -t r — — — - 

xgxxcggagacxcacaccccggxaxgcxccggaagxgaccgggacaccggxxccgagxc 
^kplsvgpyeaexgpvakaq 

GACGXGGGGGCCGXXGAGGCCCACGXXGXCXGCXCGGXAGCAGCGGACTCGCTGGCGGC3 



30529 -+ 



30535 



A 



3053? -+ + + + + 

CXGCACCCCCGGCAACXCCGGGTGCAACAGACGAGCCATCGXC5CCTGAGCGACCGCCGC 



30648 



DVGAVEAHVVCSVAADSLAA 

GCGCTXAGCCXCXGCCGCAXXCCGGCCGXXAGCGXGCCAAXCTTGAGGXXXXACAGGXCT 

80649 - + + + + + ^ 30708 

CGCGAAICGGAGACGGCGXAAGGCCGGCAAXCGCACGGTXAGAACXCCAAAATGXCCAGA 

ALSLCRIPAVSVPILREYRS 

GGCAICAXAGCXGXGGXGGCC3GCCXGCXGACGXCAGCGGGGGACCXGCCGXXGGAXCXX 

30709 - + * + + + + 30766 

CCGXAGX AXCGACACCACCGGCCGGACGACXGCAGXCGCCCCCXGGACGGCAACCTAGAA 

GIIAVVAGLLX3AGDLP LDL 

AGXGXXAXXXXAXXXAACCACGCCXCCGAAGAGGCGGCCGCCAGXACGGCCXCXGAGCCA 

80769 -+ + + + T 30323 

TCACAAI A AAAXAAAIXGGXGCGGAGGCIXCXCCGCCGGCGGXCAXGCCGG AGACXCGGX 

SVILENHASEEAAASXASE? 

GArtG AXAAAAGXCCCCGGGXGCAACC ACXGGGCACAGGACXCCmAC AaCGCCCCaGAC AX _ 

3082? + ^ ^ — 

CXXCX AXXXXCAGGGGCCCACGIXGGXGACCCGXGXCCXGrtGGXTGXXGCGGGGXCXGXA 



E 



DK3PRVQPLG X G L J Q ft P S H 



ACGGXCAGXCCAXCXCCXXCACCXCCGCCACCXCCXAGGfiCCCCTACXXGGGAGAGXCCG 

30889 _+ + + + SOSVo 

IGCC AGXCAGGIAGAGGAAGXGGAGGCGGIGGAGG AXGCX3GGGAXG AACCCXCXCAGGC 



X 



V3P3PSPPPPPRXPXU 



GCAAGGCCAGAGACACCCXCGCCXGCCAXICCCAGCCACXCCAGCArtCACCGCACXGGAG 

+ ^_ + + 1 — 1 blUUd 

CGXXCCGGXCXCXGXGGGAGCGGACGGXAAGGGXCG3XGAGGXCGXXGXGGCGXGACCXC 



A R P 



E X P 3 P A IPSHSSN. TALE 



3 



AGGCCXCXGGCXGXXCAGCXCGCGAGGAAAAGGACAXCGXCGGAGGCCAGGCAGAAGCAIj 

^ TCCGGAGACCGACAAGTCGAGCGCXCCXXXXCCIGXAGCAGCCXCC3GXCCGTCXXCGXC 
ft PLAVQLARKStX3 3SAR0KG 
AAGCrtCCCCAAGAAAGXGAAGCAGGCCXXXArtCCCCCXCAXXXAACACCAXGXXCXCGXG 
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81069 — r + + • * 

XXCGXGGGGXXCXXXCACXXCGXCCGGAAAXXGGGGGAGXAAAXXGXGGXACAAGAGCAC 

KHPKKVKQAENPLIA 

CAAijCAGCACCX 

31129 — *• + S1140 

GIXCGXCGXGGA 



- 3 - 

— 311 






B: p90



- 4 - 



AAGTGQXTCAi3TGGACACCCACCACACAi3CAXGi3CAACGACCAGTCATG , rCGAGCAXQAG 

6377 --—+■ + -r -f- + ■** 76-436 

XXCACCAAGXCACCXGXGGGXGGXGXGXCGXACCGXXGCXGGXCA6XACAGCXCGTACXC 

MATTSrtVEHS 

CXCCXCXCCAAAXXGAXXGAXGAGXX AAA6GXC AfiGGCCAACXCAGACCCCGAGGCTGAX 

6437 + + ^ + + 764*6 

GA6GAGAGGXXXAACXAACXACXCAAXXXCCAGXXCCGGXXG AGXCXGGGGCXCC3ACXA 

LLSKLIDSLKVKANSDPEAD 

GXCCTGGCCGGGCGCCXGCXCCACCGCCXXAAGGCCGAGXCAGXXACACACACAGTAGCC 

'6497 *- 4 — — + — h — ——--—— —-4.——-— -——- — +- — 76556 

CAGGACCGGCCCGCGGACGAGGXGGCGGAAXXCCGGCXCAGXCAAXGXGXGXGXC AXCGG 

VLAGRLLH RLKAESVXHXyA 

GAAXAXCXGGAGGXCXXCXCXGACAAtf XXCXACGAXGAGGA AXTCXXCCAGAXGC ACCGG 

76537- + + + +" + 76616 

CXXAXAGACCXCCAGAAGAGACXKXXAAGAXGCXACXCCXXAAGAAGGXCXACGXGGCC 



736 



•T (1 T O (_ 



EYLEVFSD KFYDEEFFQhHR 

GAXGAGCTGGAGACCCGAGTCTCXGCXXXCGCGCAGAGCCCGGCCXACGAGCGCAICGXC 

76617 + + + + "*" : + 76676 

CXACXCGACCXCXGGGCXCAGAGACGAAAGCGCGXCXCGGGCCG6AXGCTCGCGXAGCAG 

DELEXRV3AFAQSPAYER IV 

XCCAGCGGCXACCTGTCGGCCCXGCGCXACXAXGACACCXAXCXGXAXGXGGGGCGCAGC 
7 6677 + *- + +——---— -—-+-— -——-—-— •>. 

AGGXCGCCGAXGGACAGCCGGGACGCGAXGAXACXGXGGhXAGACAXACACCCCGCGXCG 
3SGYLSALRYY0XYLYVGR:f 

GGGAAGCAGGAGAGXGXGCAGCACXXXXACAXGCGGXXAGCCGGCXXCXGXGCCXCAACC 

7 6737 — — — — 4* — h— 4- — — — — -t-— — — 

CCCXTCGXCCXCXCACACGXCGXGAAAAXGX ACGCCAAXCGGCCGAAGAC ACG6 A6XX6G 

GKQESVQHFYiiRLAGFCASX 

ACCXGCCXCX ACGCGGGXCXCAGGGCAGCCCXGCAGCGGGCC AGGCCGG AGAXXGAGA6X 

7 67 c )7 + — — _——---4-——-—- + h *+■ — 

XGGACGGAGAXGCGCCCAGAGXCCCGXCGGGACGXCGCCCGGXCCGGCCXCXAACXCTCA 

XCLYA6LRAALQRARPE I £ 3 

GACAXGGAGGXGXXXGAXXACXACXXIGAGCACCXAACCXCCCAGACGGXGXGCXGCXCC 

76357 + + «■ + + + 

CXGXACCXCCACAAACXAAXGArGAAACTCGXGGAXXuGAGGGXCXGCCACACGACiSAGG 

0 h £ V H 0 Y " FEHLX3QXVC C 3 

ACGCCCXXXAXGCGCXXXGCCGGGGXGGArt^ACXCCACXCXGGCCAGCXGCAXCCICACC 
7 1 «> X 7 ———•♦• — ————— — — — + — — — — — — — — — ^ — — — — — — — — — *■. — — — — — — — — — + — — — — — — — — — + — 

XGCGGGAAAX ACGCGA AACGGCCCCACCXXXXGAGGXGAGACCGGXCGACGX AGGAGXGG 



76516 
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_pehreagvensxlasc ilx 

ACCCCCGACCXCAGCXCCGAGXGGGACGIGACCCAGGCCCXCXATAGSCaCCXGGGGCGC 

— — + — + - + 4-- » 77036 

TGGGGGCXGGAGXCGAGGCXCACCCXGCACXGGGXCCGGGAGAX AICCGXGGACCCCGCG 

TPDLS3EUDVTQALTRHLGR 

TACCXCXXXCAGCGAGCCGGGGXGGGXGXAGGGGXGACGGGGGCXGGCCA6GAXGGGAAA 

+ +. + •+■ -i- 77096 

AXGGAGAAAGXCGCXCGGCCCCACCCACAXCCCCACXGCCCCCGACCGGTCCXACCCXTX 



Y 



LPQRAGVGVGVXGAGQDGK 



C ACAXC AGCCXCCXGAIGAGGAXGAXCAACAGCCACGXGo AGX ACC ACAACXAXGGCXGC 

7T0 o 7 «- * * * + + 77156 

GXGXAGXCGGAGGACXACXCCXACXAGXXGXCGGXGCACCXCAXGGXGXXGAXACCGACG 



HISLLMRHIMSHVEYHNYQ 



_ 



AAGAGGCCGGXCAGXGXGGCGGCCXACAXGG AGCCCXGGCACAGCCAGAXXXXCAAGTXX 

77157 — + + + + + + 77.1- 

TXCXCCGGCCAGXCACACCGCCGGAXGXACCXCGGGACCGXGXCGGXCXAAAAGXXCAAA 

KRPV3VAAYMEPUHSQ I E K E 

™nn aaaCGAAGCXGCCGGAGAACCACGAGAGGXGCCCGGGCAXCXXX ACGGGGCXCXXX 
- - - - + + — — -i 77276 

AACCXXXGCXXCGACGGCCXCXXGGXGCXCXCCACGGGCCCGXAGAAAXGCCCCGAGAAA 
LETKLPENHERCPG IEXGLE 

GXCCCCGAGCXCXXCXXCAAGCTXXXXAGGGACACGCCCXGGXCGGACXGGXACCXGXXX 

^ j + + /7oo6 



"» •? " *7 v _ — 4- 



CAGGGGCXCGAGAAGAAGXXCGAAAAAXCCCXGXGCGGGACCAGCCXGACCAXGGACAAA 
VPELEEKLERDXPW3DWYLE 
GACCCC AAGGACGCCGGGGrtCCXGGAGAGGCXCX ACGGGG AGGAGXXXG AGCGCG AGXAC _ 

i _ .___-»--__—. — — —. — — — t — — - * ~r* — — — — — — j 

cxggggttccxgcggccccxggaccxcxccgagaxgccccxccxcaaacxcgcgcxcaxg 
dpkdagdlerlygeeeerey 

T A X C GGC T G G X G A C A G C G G G C A A G X X X X G X G G G CG G G I C X C C A X C A A G X CC C X G AX GX X C ^ 

. — — — I — — — / 4 _ 6 

— — — — — — — 1 — ^ 

AI AGCCGACCACXGXCGCCCGXXC AAAACACCCGCCCAGAGGX AGXXC AGGG ACIAC AAG 

•ffcLYXAGKECGR V 3 I K 3 L r. 5 
XCXAXCGXCAACXGCGCCGXCAAGGCCGGCAGCCCCTXCAXCCXXXXGAAGGAGGCCXGC 

, — — + — — — . — — — ^ — — — — — — — — > — + — — — — — / / _r 1 



' ' AGAX AGCAGXXGACGCGGC AGXXCCGGCCGXCGGGGAAGX AGGAAAACXXCCXCCGGACG 



i 

■"r.-Tprfir.AiT, 



I V N Z 



V K A G S P E ILLKEAC 



AACGCCCACXXXXGGCGCGACCXGCAGGGCGAGGCCAXGAACGCCGCCAACCXGXGCGCC ____ 

7-7517 — - * + + + * + 

XXGCGGGXGAAAACCGCGCXGG ACGXCCCGCXCCGGX ACXXGCGGCGGXXGGACACGCGG 

NAHEWROLQGEArtMAANLCA 

r^GGXGCXGCAGCCCXCGAGGAHGXCXGXGGCCACCXGCAATCXGGCCAACAXCXGCCXC 
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775' 
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— i 77636 



CXCCACGACGXCGGGAGCXCCXXCAGACACCGGXGGACGXXAGACCGGXXGXAGACGGAG 
EVLQPSRKSVAXCNLArilCL 

CCGCGCXGCCXGGX6AAIGCGCCXCXGGCGGXGCSGGCACAGCGGGCCGACACGCAGGGG 

,.6w/ ~ ^ " -r -r + 77G3G 

GGCGCGACGGmCCACXTACGCGGAGmCCGCC ACGCCCGTGXCGCCCGGCTGIGCGXCCCC 



PSCLVNAPLAVRAQRADTQ 



G 



776^7 



G A X G AACXCCTGCXGGCCCXCCCTCGwCTCXC AGXC ACCCTACCXGGAG AoGGGGCAGTC 

— _ — ™— — ————— + — ——— — — — — ———.» — ______ .^______ 

CXACXXGAGGACGACCGGGAGGGmGCXGAGAGXCAGXGGGAIGGACCXCXCCCCCGXCAG 
DELLLALPRLSVXLPGEGAV 

GGXGAXGGAIXCXCGCXAGCCCGCCXCAGAGAXGCCACCCAGXGXGCCACCXXXGXGGXG 

/ / / ^7 + + + + -r 7731o 

CCACXACCXAAGAGCGAXCGGGCGGAGICXCXACGGXGGGXCACACGGXGGAAACACCAC 

G D G E 3 L A R L R D A X Q C A X F U U 

GCCXGCXCCAXXCXXCAGGGAXCCCCCACXXAXGAXXCCAGGGAXAXGGCCTCCAIGGGC 
778X7 + + + + . + 77g7 . 

CGG ACGAGGXAAGAAGXCCCXAGGGGGXGAAXACXAAGGICCCX AX ACCGGAGGX ACCCG 

ACS ILQGSPTYDSRDHASrtG 

CXCGGGGXGCAGGGCCXGGCCoAXGXCXXXGCGGACCXGGGCTGGCrfGXACACXGACCCX 
77377 + + + + + + 77936 

GAGCCCCACGXCCCGGACCGGCXACAGAAACGCCXGG ACCCGACCGTCflXGXGACTGGGA 

LGVQGLADVEADLGUQiTD? 

CCCXCXCGCXCGXXAAACAAGGAAAXAXTCGAAC^XAXGXACXXXACGGCCCXCTGCACC 
77*37 + + + -r + ?7?9g 

GGG AGAGCGAGCAAXXXGXTCCXXT A X ArtGCXTGXAX ACAXGAArtlGCCGGG AG^CGXGG 
P S R S L N K E IcEriMYEXALCX 



AGXAGXCXGAXXGGACTTCaCmCCAGGAAGAXXXXTCCGGGXXXCAAaCAGaGCAAGXAX 

XCAXCAGACXAACCXGAAGXGXGGXCCXXCXAAAAAGGCCCAAAGXXXGXCXCGXXCAIA 

33LIGLHTRKIEPGEKQSK * . 

GCCGGGGGGXGGXXXC ACXGGC ACGw XXGGGCAGG ArtC AG ACCXXXCX A XXCCCAGGGAA 
73057 + — — — + ~ - + — -4 + h 7311b 

CGGCCCCCCACCAAAGXG«CCGTGCXAACCCGXCCXXGXCXGGAArtGAXAAGGGXCCCXX 

AGGWEHUHOUaGXDLSIPRE 

AXXXGGXCXCGCCXCXCXGArtCGCAXXGXGAGGGAXGGGCXXXXCAAXXCACAGXTXAXC 
7o 1 17 + + + + + - "317* 

I AAACCAGAGCGGAG AGACXXGCGXAACACXCCCXACCCGAAAAGXXAAGXGXC AAAIAG 

IW3RL3ER IVR0GLEM3QE I 

GCCCXGriXGCCCACCXCAGGCTGX'jUCC»^GGXGMCGGGCXGXXCGGAC.3CCIXCXAi:CCC 
73177 + + + + + 78_3b 

CGGGACXACGGGXGGAGXCCGACaCGGGXCCACXGCCCGACAAGCCXGCoGAAGAXGGGG 
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- 7 - 

ALMPXSGCAQVXGCSDAEYP 
XXCXAXGCCAAXGCGXCCACCAAGGXCACCAACAAGGAGGAGGCCCXXAGGCCAAACCGG 

S *° ? aagaxacggxxacgcaggxggxxccagxggxxgxxccxccxccgggaaxccggxxxggcc 

EfAMASXKVXNKEEALRPNR 

XCTXXXXGGCGXCAXGXGCGXCXGGAIGACAGGGAAGCXIXGAAXCXXGICGGGGGCCGX 

3297 + ■*■ + + v + 

AGAAAAACCGCAGIACACGCAGACCXACXGXCCCXXCGAAACXXAGAACAGCCCCCGGCA 

SEURHURLDDREALNLVGGR 

GTCXCCXGCCXCCCGGAGGCXCXGCGGCAGCGCXACCXGCGXXXCCAAACGGCCXXXGAX 

3357 + + + + + + 73416 

CAGAGGACGGAGGGCCXCCGAGACGCCGXCGCGAIGGACGCAAAGGXXXGCCGGAAACXA 

V3CLPEALRQRYLREQXAHC 

XACAACCAGGAGGACCXGAXXCAGAXGXCCCGGGACAGGGCCCCCXXXGXGGACCAGAGC 

r8417 - — + + + + + *a-/o 

AXGXXGGXCCXCCXGGACXAAGXCXACAGGGCCCXGXCCCGGGGGAAACACCXGGXCXCG 



Y N Q 



EDL IQttSRDRAPEVDQ 



CAAXCXCACAGCCXGXXXXXGCGXGAGGAAGAIGCCGCGCGGGCCAGCACGCXAGCCAAC 

73477 + + * + + + 78536 

GXXAGAGXGXCGGACAAAAACGCACXCCXXCXACGGCGCGCCCGGXCGXGCGAXCGGXXG 



Q 



HSLELREEDAARASXLAN 



CXACXGGXGCGCAGCXACGAGCXGGGCCXGAAGACXAXCAXGXACXAIXGXCGCAIXGAG 

L . j. + — h 73596 

78537 *• + + 

GAXGACCACGCGXCGAXGCXCGACCCGGACXXCXGAXAGXACAXGAXAACAGCGXAACXC 

LLVRSYELGLKXIMYYCRIS 

A AGGCCGCCG AXCXGGGGGXGAIGGAGXGXAAGGCCAGCGCGGCXCXGXCGGXGCCGC3G 

735*7 -r + -r + + * >3bib 

XXCCGGCGGCXAGACCCCCACXACCXCACAXXCCGGXCGCGCCGAGACAGCCACGGCGCC 

KAADLGUttECKASAALSVPR 
gaggaacagaaigagcgg AGXCCCGCXGAGCAGAXGCCGCCXCGXCCCAIGGAACCQGCG _ _ 

___...— i_ — _ _ — — ___ — + — — — — — — —— — ■*• — — — — — — — — — — — — — — — y O / 1 

7 ^ ^ ^? — — — — «y«— — — — — — T — ■ 

CXCCXXGXCXXACXCGCCXCAGGGCGACICGICXACGGCGGAGCAGGGXACCIXGGCCGC 
EEQNE ftSPAEQMPPRPrtEPA 

CAGGXXGCGGGGCCGGXXGACAXCAXGAGCA AGGGCCCAGGGGAGGGACCAGGXGGGXGG _ 

. _ _ _ — — ————————+—— / o / / '3 

* , 371 > *t ___+. — — ——— — — — — +.——— — — — h — — — — — — — — — T 

GXCCAACGCCCCGGCCAACXGXAGXACXCGXXCCCGGGXCCCCXCCCXGGXCCACCCACC 

Q V A G P V D IttSKGPGEGPGGU 
XGXGXGCCCGGGGGAXXGGAAGIGXGCX AXAA6X ACCGICAGCXCXXCXC AGAGGAIG AX 

_ . — — — l. — — — — — — — — ' .jOiJO 

7377 — -v- __ — — — — — + — — ——— ———— -v - — — -~ - * — — — — — -r 

ACrtCACGGGCCCCCXAACCXXCACACGAXAIXCAXGGCAGXCGAGArtGAGXCXCCTACXA 

C v p ti r, L £ V C i K YRQLS3E0 0 
CXGXTGG AGACXGACGGXXXXrtCXGAACGAGCCXGXG AAXCXXGCCAA I AhmCGXXXAXX 
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3837 + + + + + 73396 

GACAACCTCTGACIGCCAAAAIGACTTGCTCGGACaCITAGAACGGITATTTGCAAATAA 

LLETDGETERACESCQ* 

GCCATGTCCAAGTTGTTG 

3897 -i- -r 73914 

CGGTACAGGT7CAACAAC 
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rt 



R G R E X 




Q M P V A R Y >3 Q P E I « 'J R L H Q 0 0 



GGAGAGGCAAACAXACAGGAGGAAAGGCXAIAIGAGCX^^ 

1320 CCTCICCGXTTGTATGTCCTCCXTTCCGA 

G E A N I Q E E R L t E L L S D P R 3 



.-^ i~* 

1879 




QLDPGPLIAEN 




N 14 




N ;3 £ Q G £ H L G - E 5 A L £ A 




rm 1 1 i.i rvi ta* re i - _* ^ ^ 




e i Q a 



•j L R V £ 




„ 6R QEGHEWHRS8PSBRQBQ 

GCCArCAAICACCXXGXCCieXXXliAUAACGCCC^^^ 
2240 caeXA6XXAai6QAACAGGAC^ 
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A INHLVLEDNALRKYD3GQV 

GCSGCGGGCXXCCAGAGGGCCCXXCXGGIGGCCGGGCCAGAGACCGCXGACACGAGGCCG 

300 + + «•■ ■*■ + + 2359 

CGCCGCCCGAAGGXCXCCCGGGAAGACCACCGGCCCGGXCXCXGGCGACXGXGCXCCGGC 

AAGEGRALLVAGPEXADXRP 

GACCXCCGCAAGCXGAAIGAGXGGGXGXXXGGXGGCAGGGCXGCXGGXGGCAG ACAGCXG 

360 + + + + * 2419 

CTGGAGGCGTTCGACTTACTCACCCACAAACCACCGTCCCGACGACCACCGTCTGTCGAC 

DLRKLNEWVEGGRAAGGRQL 

GCCGACGAGCXAAAGAXCGXGXCCGCGCXGCGAGACACXXACXCGGGCCACXTGGXCCXX 

420 + ■*■ + + + 247? 

CGGCXGCXCGAXXXCXAGCACAGGCGCGACGCXCXGXGAAIGAGCCCGGXGA ACCAGGAA 



DELKIVSALRDXYSGHLVL 



CAGCCCACGGAGACCCXXGACACAIGGAAGGXGXIGAGCAGGGACACACGA ACCGCXCAI 

430 + + + + + + 2539 

ijXCGGGXGCCXCXGGGAACXGXGIACCXXCCACAACXCGXCCCXGXGXGCXXGGCGAGXA 

QPXEXLDXWKV^LSRDXRXAH 

AGXXXGGAGCACGGAXXCAXXCAXGCCGCGGGGACCAXCCAGGCCAACXGCCCACAGCXG 

540 + + + + + + 2599 

XCAAACCXCGXGCCXAAGXAAGXACGGCGCCCCXGGXAGGXCCGGXXGACGGGXGICGAC 

5LEHGE I HAAGX IQANCPQL 

XIXAXGAGACGCCAGCACCCCGGCCXCXXXCCCXXCGXXAAXGCAAXAGCAICAXCGCXG 

:G00 + + + + **" H 2659 

AAAX ACXCXGCGGXCGXGGGGCCGGAGAAAGGGAAGCAAXXACGXXAXC3XAGXAGCGAC 

HMRRQHPGLFPEVNAIA3SL 

GGCXGGTrtCiACCAGACCGCCACCGGCCCCGGAGCAGAXGCCAGGGCSGCGGCCCGGCGC 

J 6 o 0* + + + + ■*■ -/i*J 

CCGACCAXGAXGGXCXGGCGGIGGCCGGGGCCXCGTCXACGGXCCCGCCGCCGGGCCGCG 

GtJYYQXAXGPGADARAAARR 

CAACAGGCCXXXCAGACCAGGGCGGCGGCXGAaXGCCaTGCCAAAAGCGGGGXGCCGGXC 



720 * + + + ■*■ • 

ijXXGXCCGGAAAGXCXGGXCCCGCCGCCGACXXACGGTACGG TXXXCGCCCCAC3GCCAG 

L> Q A E Q X R A A A £ C H A K 3 G V P » 

GXGGCCGGCXXCXACAGGACCAXCA ACGCCACGCXC A AGGGAGG AG AGGGCCXACAGCCC 

730 * -f + + + 

CACCGGCCGAAGAXGXCCXGGXAGXXGCGGXGCGAGXXCCCXCCXCXCCCGGAXGXCoGG 

V A G E Y R X I N A X L K G G E G L Q P 

ACXAXGXXXAACGGGG AGCXGGGGGCCA IC AAGC ACC AGGCACXXG ACACXGXGAGGXAI 

340 + + + + + 

IGAXACAAAXXGCCCCXCGACCCCCGGXAGXXCGXGGXCCGXGAACXGXGACACXCCAIA 

XnEMGELGA IKHGALOTVRY 
GACXACGGCCACXAXCXCAXAAXGXXGGGGCCAXXCCAGCCAXGGAGCGGACTGACGGCC 



1 1 



-1 



2399 
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+ 2959 



>CJ 00 * -r * -r 

CXGAXGCCGGXGAXAGAGXAXXACAACCCCGGXAAGGXCGGXACCXCGCCXGACTGCCGG 

DYGHYL ItlLGPFQPWSGuXA 

CCXCCGXGCCCCX ACGCCGAAAGXXCAXGGGC ACAGGCGGCCGXGCAGACGGCCCXCGAG 

>g &0 * + + + 301? 

GGAGGCrtCGGGGAXGCGGCXXXCAAGXACCCGXGXCCGCCGGCACGXCXGCCGGQAGCXC 

? p C P Y A E 3 S U A Q A A V Q X A L E 

CXGXXCICGGCCCXGXACCCGGCCCCGXGCAXCXCGGGCXACGCGCGCCCCCCGGGCCCC 

3020 * + + + + + 3079 

GACAAGAGCCGGGACAXGGGCCGGGGCACGX AGAGCCCGAXGCGCGCGGGGGGCCCGGGG 

LE 3ALYPAPCISGYARPPGP 

AGXGCXGXGAXCGAGCAXCXGGGGXCCCXAGXXCCAAAGGGGGGXCXGCXGXXGXXXCXG 



3030 + + + + * 

ICACGACACXAGCXCGXAGACCCCAGGGAXCAAGGXXXCCCCCCAGACGACAACAAAGAC 



3139 



A V IEHLG SLVPKGGL.LLEL 



XCXCACCXACCGGAXGAXGXXAAGGACGGGCXCGGAGAAAXGGGGCCGGCCAGGGCCACG 

31A0 + + + + + 3139 

AGAGXGGAXGGCCXACTACAAXXCCXGCCCGAGCCXCXXXACCCCGGCCGGXCCCGGXGC 

3HLPDDVKDGLGEHGPARAT 

SGACCXGGAAXGCAGCAGXXXGXCAGCAGCXACXXCCXCAACCCCGCCXGIXCCAACGXC 

3200 - + * ° 

C2XGGACCXTACGXCGXCAAACAGXCGXCGAXGAAGGAGXXGGGGCGGACAAGGXXGCAG 

QPQMQQEVSSYBLKPACSNV 
TTCAXX ACAGXGAGGCAGCGAGGGGAGAAGAXCA ACGGCCGXACCGXCCXCCAAGC6CIC 



^259 



'30 + * + ■** *•* • 

AhGXAAXGXCACXCCGXCGCXCCCCXCXXCXAGXXGCCGGCAXGGCAGGAGGXXCGCGAG 



3319 



E I X V r: Q BGEKINGRXVLQAL 
GGACGCGCAXGCGAXAXGGCAGGCXGCCAGCACXAXGXGCXGGGCXCCACGGXXCCCCXC 

--\ r-i 

— . -\ — _ — — — — — + — — — — — — 4- — — — . — — . — — — . — — — — — — ..JO/ 

" CCXGCGCGXACGCXAXACCGXCCGACGGXCGXGAXACrtCGACCCGAGGXGCCAAGGGGAG 

GRACDHAGCQHYVLG3XV PL 

GGIGGACXCAACTXXGXCAACGACCXGijCGXCCCCGGXXXCCACCGCCGAGAXGAXGGAX 
33<>0 ^ — . | — _—— — ..— .4. — — — — — — — — — 4- — \ V —— — — — — — ino . 

C C A C C T G A G X X G A A AC A G X X G C X G G A C C G C A G G G G C C A A A GG X G G C G G C X C X A C X A C C X A 

GGLNEVND lASPVSXAEMHO 

GATXXCXCXCCCXXCXXCACCGXGGAGXXXCCCCCGAXXCAAGAGGAGGGCGCAAGXXCX 

3440 + + + + + 3499 

CTAAAGAGAGGGAAGAAGXGGCACCXCAAAGGGGGCXAAGIXCXCCXCCCGCGXXCAAGA 

DE3PEEXVEEPP IQEEGA33 
CCiVjXACCCXXAiiAXGXGGACGMGAGCAXGGHCAlCTCXCCGXCXXACGAGXXGCCCXGG 

, . 3 c i.1 

>i- - t ^^, > — _ — — — — — — — 4> — — — — — — — — — — — — — — — — — — + — — — — — — — *- —+—————— — — — — — — — ijw - 

GGCCAXGGGA«XCXACACCXGCXCXCGXACCXGXAG»GAGGCAGAAXGCXCAACGGGACC 
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P ? L D V D E 5 H B I S P 3 X S u P W 

CTCTCGCTGGAGXCAXGCCXCACAAGCAXCCXGXCfiCACCCCaCCGXGGGAAGCAAGGAG 

3560 + + + + ZZZZZ 

GAGAGCGACCTCAGTACGGAGXGXTCoXAGGACaGXGXGGGGXGGCACCCXXCGXTCClC 

L3LE3CLTS ILSHPTVQSKc 

C ACXXGGXC AGGCACACGGAC AGGGXCAGCG6AGGACGCGX6GCAC AGC A6CCCGGG6X A 

3620 * + * * T 

GTGAACCAGTCCGTGTGCCTGTCCCAGTCGCCTCCTGCGCACCGTGTCGTCGGGCCCCaT 



367 



HLVRHXDRySQGRVAQQPQy 

GGTCCCCTGGACCXGCCGCTGGCGGACXACGCCXXCGTXGCCCACAGXCAGGICTGGACC 

3630 + + * **" **" + 

CCAGGGGACCXGGACGGCGACCGCCXGAXGCGGAAGCAACGGuXGXCAGXCCAGACCXGG 



GPLDLPLADYAHVAH3QVW 



AGGCCCGGXGGGGCXCCXCCCXTGCCCTAXCGXACCXGGGAXCGA AIGACAGAGAAGCXG 
3740 + + + + + ^ T 3/ " " 

xccgggccaccccgaggagggaacgggatagcaxggacccxagcttacxgtcxcxxcgac 

rp g g a p p l p y rtw drh x *e k l 
ctxgxcxccgcaaaacccggcggagagaacgxtaaggtxxcaggtaccgxgatxacaxtg 

3300 + + "** + """*** ~ + 385? 

GAACAGAGGCGXTXTGGGCCGCCXCTCXTGCAAXXCCAAAGTCCAXGGCACXAAXGXAAC 

LVSAKPGGENVKV3GXV I T L 

GGAGAACAGGGGTACAAAGXGTCGTXGGATCTGA6GG AGGGAACCAGGCTGGC AAXGGCT ^ ^ 

+ ■»■ + + + "*" 

CCXCTXGXCCCCAXGIXXCACAGCAACCXAGACXCCCXCCCTTGGTCCGACCGTTACCGA 

Q E 0 G Y K V S L D L R E G T ?/ L A fl A 
GAGGCGCXGCXGAACGCAGCAXGXGCCCCAATCTTGGAXCCGGAaGACGXCTT ^ ^ ^ 

*3 20 — — J. — — — — — — — — — —— — — — — — — — — —— — .J . 

CTCCGCGACGACXTGCGTCGXACACGGGGTXAGAaCCXAGGCuXXCTGCAGhACGAGTGG 



3360 + 



ALLNAfiCAP ildpedvll 



A. 



CXGCAXCXACACCTGGAXCCGCGCCGGGCAGACAACXCGGCCGXGATGGAGGCXAXGalij ^ 

3 ..-Vg 0 ^ + + + f ^ 

GACGXAGAXGXGGACCXAGGCGCGGCCCGXCXGTTGaGCCGGCACTACCXCCGAXACXGC 

LHLHLDPRRADN'SAVHEAriT 

GCGGCGAGXGACXACGCGCGXGGCCXGGGCGXGAAGCXGACCTXTGGCXCGGCCXCCXGC 

, _ _ — — — >b — — —————— — -Jk » • 

4040 + + **• * * -~ 

CGCCGCXC ACXGAXGCGCGC ACCGG ACCCGCACTXCGACTGGAAACCGAGCCGij AGGAtG 

A A 3 0 Y A R G L G V K I X E G S A 3 C 
CCCGAGhCCGGCTCGTCCGCCTCCAACTXCAXGACCGXGGXGGCCXCXGTCXCCGCCCCA ^ 

Al0 o + * + + + * 4iJt 

G G G CX G X G G C C G A GC A G GC G G A G u I X G A A G X A C X G G C A C C A C C G G A G A C A G A G G C G a G b X 



l> E t 6 3 3 A 



i r> t' h r V V A £ l i o 



GGGaAXXCXCGGGXCCXCXGaXCACGCCAGXGCXTCAGmAGaCGGGCaGXCXCCXGATX 
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4 2i-r 



4160 4- -r -i- + 

CCCCXXAAGAGCCCAGGAGACX AGXGCGGXCACGAAGXCXXCXGCCCGXCAGAGGACXAA 



nSGPL ITPVLQKTGSLL I 



GCGGXGeGXXGCGGGGAXGGCAAGAXCCAGGGAGGGXCGCXGXXXGAGCAGCXCXXXAGC 

4220 + ■+■ + -i 427? 

CGCCACGCAACGCCCCXACCGXXCXAGGXCCCXCCCAGCGACAA ACXCGXCGAGAArilCG 



ti 



VRCGDGKIQGG3LFEQLF3 



GACGXGGCCACGACCCCACGGGCACCCGAGGCGXXGXCXCXGAAGAAXCXCXXCCGGGCA 

4230 +* -r + +■ + + 4339 

CXGCACCGGXGCXGGGGXGCCCGXGGGCXCCGCAACAGAGACXXCX7AGAGAAGGCCCGX 

DVAXXPRAPEAL3LKNLFRA 

GTCCAGCAGC7GGTCAAGAGCGGCATCGTGCTGTCAGGGCATGACATCAGCGACGGGGGC 

4340 + + + + f- 4399 

CAGGXCGXC3ACCAGXXCXCGCCGXAGCACGACAGXCCCGXACXGXAGXCGCXGCCCCC3 

VQQLVK3G IVUSGHD ISDGG 

CXGGXGACCXGCCXGGXGGAGAXGGCCCXGGCCGGGCAGCGGGGAGXGACCAXCACXAIG 

4400 * + h + +- 4459 

GACCACXGGACGGACCACCXCXACCGGGACCGGCCCGXCGCCCCXCACXGGXAGXGAXAC 

LVXCLMEhALAGQRGVX IXh 

CCGGXGGCCXCCGACTACCXCCCGGAGAXGXXXGCAGAGCACCCCGGCCXGGXGXXXGAG 

4460 +■ + + +* +■ + 4519 

GGCCACCGGAGGCXGAXGGAGGGCCXCXACAAACGXCXCGXGGGGCCGGACCACAAACXC 

PVASDYLPEHFAEHPGLVFE 

GXGGAGGAGCGCAGCGXGGGXGAGGXGCIGCAG ACCCXGCGCXCCAXGAACAXGX ACCCG 

4520 + + + 4579 

CACCXCCX2GCGXCGCACCCACXCCACGACGXCXGGGAC3CGAGGX ACXXGXACAXGGGC 



t 



t E R SVG E V L d T LRSMN'rtYP 



GCnGXCCXCGGXCGAGXGGGCG«GCArtGGXCCAGAXCAArtTGXXXG«i3GXGCAGCAC3GC 

4530 -r -r +• -r 4639 

CGXCAGGAGCCAGCXCACCCGCXCGXXCCAGGXCXAGXXXACAArtCTCCACGXCGXGCCG 

A u L G R x > G E Q G P D Q rt F £ V Q H G 

CCAGAGACGGTGXXGCGCCAGTCGCXGCGCCXGCTGCXGGGAACCXGGXCAICCXTXGCC 

4640 +■ 1 1 **- ■** 4699 

GGTCXCTGCCAC AACGCGGXC AGCGACGCGG ACG ACGACCCTXGG ACC AGT AGGr^AACGG 

P E X V L ft Q S L R L L L G X U 3 3 F A 

AGCG AGCAGXACGrtGXGCCXGCGACCAG AXCGG AXX AACCGGXCC AXGC ACGXGXCCGAC 

4700 + 1 1 +• + _— - — + 4759 

XCGCXCGXCAXGCXCACGGACGCXGGXCXAGCCXAAXXGGCCAGGXACGXGCACAGGCXG 

3EQY ECLRPDR INRSHHVSB 
XAi^UiiCXAX^AiJO^^ijC^CXuJjC^GXCXCCCCGXXGACAGGArtrtGHAXCXCAGCCCACGC 

4, b*-/ +. — — — — — — — — — .». — — —._ — —.»... — — — — — — — — — — — — — —— -r — — — '■ -iui - 

^XGCCGrtXAXXGCXXCGXGrtCCaXCAGAiiGGGCArtCXGICCXXXCXXAGAGXCGGGXGCG 
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*G*NEALAVSPLT.3KN 



i- -3 r r* 



CGGXXGGXGACAGAGCCXGACCCACGAXGXCAGGXGGCCGXGCXAXGCGCCCCGGGCACC 
43 20 * + + + * + 4g79 

GCCAACCACXGXCXCGGACXGGGXGCXACAGXCC ACCGGCACGAIACGCGGGGCCCGIGG 
R l-VTEPDPRCQVAVLCAPGT 



AGGGGCCAXGAAftGCCrCCTGGCGGCCTTCACGAATGCCGGATGCCTGTGCCGACGGGTG 
4380 -i -h ^ +. — . 4. ^ 

TCCCCGGTACTTTCGGAGGACCGCCGGAAGTGCTTACGGCCTACGGACACGGCTGCCCAC 
RGHESLLAAETNAGCLCRRV 



4939 



4940 + 



XXCXXXCGCGAGGXXAGGGACAACACGXXCCXCGACAAGXACGXGGGXCXGGCCAXCGGA 



4999 



AAGA AAGCGCTCCAATCCCTGTTGTGCAAGGAGCTGTTCAXGCACCCAGACCGGTAGCCT 
FHREVRDMTELDKYVGLA IG 

GGAGXICATGGGGCCAGGGACTCXGCCCIGGCAGGCCGXGCCACCGTGGCGCTGATTAAI 
sooo * * + + + + 5059 

CCXCaAGXACCCCGGXCCCXGAGhCGGGACCGXCCGGCACGGXGGCACCGCGACXAATXA 
G V H G A R D.. 5 A L A G R A X V A L. * I H 



5060 * 



CGXXXCCCCGCCCXGCGXGACGCXAXXCXAAAGXXCCXCAACAGGCCAGAXACGXXCXCG 



■+ 5119 



GCAAAGGGGCGGGACGCACXGCGAXAAGAXXXCAAGGAGXXGXCCGGXCXAXGC AAGAGC 
RFPALRDA ILKFLNRPDTFS 

GXGGCCTXGGGGGAGCXGGGGGXGCAaGXXXXGGCXGGCCXGGGGGCCGXGGGGXCAACA 
120 f + + + + + 5179 

CACCGGAACCCCCXCGACCCCCACGXXCAAAACCGACCGGACCCCCGGCACCCCaGXXGT 



y A L G E L G V Q y L A G L G A V G 



GAXAATCCACCCGCCCCTGGCGXGGAAGTTAAXG7CCAGAGAICACCTCTGAIXCTGGCC 

130 + + -r * 52-,-. 

CrAXTAGGXGGuCGGGGACCGCACCTXCAAriACAGGICXCTAGXGGAGACTArtGACCGG 



DNPPAPG^EVNVQRSPLIL 



CCCAhCGCCXCTGGCAXGXXXGaGXCCCGCTGGCTGAACAIXAGCAXCCCGGCGmCCACC 
5240 - * + -r + + 

GGGXXGCGGAGACCGXACAAACXCAGGGCGACCGACXTGTAAXCGXAGGGCCGCXGGXGG 
PNft5»3dE£3RWLNISIPAXX 



5299 



300 



AGCTCTGTCAX-jCXGCGXGGCCXCCGGGGCXGCGXCCXGCCXXGTXGGGXGCAAGGCXCG 



TCGAGAC AGXACGACGCACCGGAGGCCCCGACGCAGGACGGAACAACCCACGXTCCGaGC 

SSVKLRGLRGCYLPCUVQGS 

TGCCXGGGCCXGCAAIXXACXAACCXCGGGAXGCCAXAXGXXXXGCAGAAXGCCCACCAG 
5360 + + +. + + 5-il9 

ACGGACCCGGACGXXAAAXGAXXGGAGCCCXACGGXATACAAAACGXCXTACGGGXGGXC 
CLijLUeTNLGnP * V L 0 N A H U 
AXCGCOXGCCACXXCCAUAGCAAXGGCACGGAXGCCXGGCGCXXXGCXAXGAAXXAXCCA 
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5420 f * + "~ + 

XAGCGGACGGXGAAGGXGXCGXXACCGXGCCXACGGACCGCGAAACGAXACXXAAXAGGX 

IACHEHSNGXDAWRFA MN YF 

GAAACCCCACGGAGCAGGGCA ACAXXGCAGGGCXCXGXXCACGCGAXGGXCGXCAXCXG 



430 



arc: t o 

w w w •* 

XCTXXGGGGXGCCXCGXCCCGXXGXAACGXCCCGAGACAAGXGCGCXACChGCAGTAG^C 



b ftNPXEQQNIAQLCSRDQRHu 

GCTCXCCTGXGXGACCCCXCACXXXGXACAGACTXXXGGCAAXGGGAGCACAI-CCCCCC 

5540 + + + + 

CGAGAGGACACACXGGGGAGXGAAACAXGXCXGAAAACCGXX ACCCXCGXGXAAGGGGGG 

b ALLCDPSLCXDEWQWEHIPP 

GCCXXXGGGCACCCCACGGGGXGCXCCCCCXGGACACXXAXGXXXCAAGCAGCXCACCXA 



5600 + + + + + J 

CGGAAACCCGXGGGGXGCCCCACGAGGGGGACCXGXGAAXACAAAGXXCGXCGAGXGGAI 



659 



A 



EQHPTGCSPUTLMF-QAAHL 



XGGXCACXCAGGCACGGXCGCCCCXCCGAGXGACCAGXCAC 

5660 + + + + + 5700 

ACCAGXGAGXCCGXGCCAGCGGGGAGGCXCACXGGXCAGXG 

USLRHGRPSEA 
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AICCAlCriTAXXGACAAXXAXCAAAAAACCACCXTAXXTCCAAACTTTAAXATiCTTCG 

43300 *• + + -r -t- -t- 43359 

XAGSTAGAAAXA ACTGXIAAXAGXXXXTTGGXGGAAXAAAGGXXXG AAAX7ATAAG AAGC 

d * * F V V K N G F K L I R K 

XACCGGCGCCACCXCTXCAAXXAIAXAGXGXCCGTAaXGGAXGGGGGCGXGGGXCXGXTT 

43360 + * + + + + 43419 

AIGGCCGCGGXGGAGAAGTTAAXAIATCACAGGCAIXACCXACCCCCGCACCCAGACAAA 

d V P A 0 E E IIYHGYHIPAHXQK 

GACAGACAXAAACTCATCGAXGAGXGCCCGGGAGGAGGCXGAGAGTGCGGGGAAXGCCTC 

43420 + **■ + + + + 43479 

CXGTCXGXAXXXGAGTAGCTACXCACGGGCCCXCCXCCGACXCXCACGCCCCTXACGGAG 

d USHF5BILARSSASLAPEAS 

CXGCAGAAAGCXGCAGGGCTGC7CCAGAAACA15gTCAGXGCCAGCA AXCACX AC AAACXG 

4*2430 + -c + + + 43539 

GACGXCTTTC3ACGXCCCGACGAGGXCIXXGXGCAGXCACGGXCGXXAGXGAXGTXXGAC 

d QLF5CPQELF VBXGA IVVEQ 

CACCTCXGXGXXGCTGGXGGCTGGGXGCCCXCCAAGXCGCXGGCIGXACXCGXTGACCAX 
43540 +" •*■ + + -r 43599 

GXGG AGACACAACGACCACCGACCCACGGGAGGXTCAGCGACCGACAXGAGC AACXGGX A 

d VEXN3XAPHGGL RQSYENVM 

♦3XXGX AGAGTCCCCXGXTGXXGCGC AGAAGCXCCTCCXXGXXGAAAAAXGCCCGGC AGGG 

43600 ■»• + + + + 43659 

CaaCAXCXCAGGGGACAACAACGCoXCXXCGAGGaGGhACAACXXTXXACGGGCCGXCCC 

d r4 Y L G RMNRLLEEKNFFARCP 

GCTGX AG A GGCCCGGQ A CGGCCGXCXGGCG AX AGGrtGGAGXTGX AC AXG A TGXC AC 3C AG 
43660 + 4371v 

CGACAXCXCCGGGCCCTGCCGGCAGACC0CXAXCCXCCXCAACAXG1ACXACAGXGGGXC 

d SYLGPVATQRYSSrtYMIDGL 

AGAACCCAGCXGAGAXGCCCAGGGAXXCACAGiGCXCCGGXAXXCAXAGGCGGCAXCCGG 
43720 + + + + + 43779 

XCXXGGGXCGACXCXACGGGXCCCXAAGXGXCACGAGGCCAIAAGXAXCCGCCGXAGGCC 



3 *j 



LQ 3AUPMVXSR YE YAADP 



GCGAGAAXGGXCAXAGAXGAGCCCCXCGGCAACCXCCXGAXTGXAGXXXXCACAGGAGAC 

43730 + ■*• + + 4333*-? 

CGCXCXTACCAGXAXCXACTCGGGGAGCCGXXGGAGGACXAACAXCAAAAGXGXCCXCTG 

R 3 H 0 Y ILGEAVEQNYNEC3V 

CACACAGGCriGCCCGXCCCCXXGbHijAGTXGGACrrXXGAAAAXAAGCCACGXCXGCCGX 

■43G40 -r + + + 43*399 

3XGXGXCCGCCGGGCAGGGGAACCXCICAACCXGA«AACXTXXAXXCGGXGCAGACGGCA 
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V C A A R G ft P 3 * S K a £ r A y D a t 

GACCGGXGXTACGAIAATCXCACftGGXGGCCXGCXGGCCGXGGCAGAGXCCXGGAGCXCC 

43900 +• + * + 4 _ . J 

CXGGCCACAAXGCTAXXAGAGXGXCCACCGGACGACCGGCACCGXC7CAGGACCXCGAGG 

V p T V I IECTAQQGHCLGPAG 

AXXAACAXXAGXCAIACCXGCCAGGXAXGXCCTGGGGXCCCGAAGCAGCGXCCCAXXGCG 

4"3*G0 «- + + 440 iV 

X AAXXGXA AICAGXAXGGACGGXCC:^T ACAGGACCCCAGGGCTTCGXCGCAGGGX ArtC GC 

NVNTMGALYTRP&aLLTGrtS 

CXGAGCGCCCACCXTGGCCXXGAXGXAGXC AIXG ACTXGCXGGXXGC2AA AGGCCXC3GC 

44020 + + + + * - 407? 

GACTCGCGGGTGGAACCGGAACXACAXCAGTAACXGAACGACCAACGGXXXCCGGAGCCG 

Q ft G V K A K lYDNVQQNGFAEA 

cggaaagacgcxaaagaagxcxxgggxgxggaxacccaxgxcagxagxgaxggccgccac 

44080 + + +" ""■ + 4413'$ 

gccxxtcxgcgaxxtcixcagaacccacaccxaigggxacagicaxcacxaccggcggxg 
psv3eedqxh ighdxt i a a v 

CCXGGCCGGAGXCAXGGXCGAGCXAXAACXAaGCCCGGXGXCGhXGGAGGCCaXCTCGXG 

j 

44140 + + -** + 

ggaccggccxcagxaccagCxcgaxaxxgaxxcgggccacagcxacctccggxagagcac 

RAPXJ1X3SYSLGXD I 3 A rt E K 

AXGCACCXCAAAGGTXACCGCGICCACCCXGGCCXC2CGGCGGCXAACAXXXGGGGXCCC 

44200 + + + ^ * 

TACGXGGAGXXTCCAAXGGCGCAGGTGGGACCGGAGGGCCGCCGA IXGXAhACCCCAGGG 

HVEETVADVRAERR3VMPXG 

A A X G A A C A X G G A X G XT G AG G CC C X G G A G CX A A A C A A X A X G X X 7 X C AG A G A G G A X C X C A X C 

442*0 * + * 

X 7 A C X X G X A C C X A C A A C X C C G G G A C 2 X C G A X X X G X X .-i X A C A ft ft ft GT C 7 C X C C 7 AG A G X ft »j 

1 7 n 3 X 3 A R 3 3 ? L I N E 3 L I E 0 

GGXCCXGACCACGGXCAXGGCCACCCCXGGGXGGA XC rXGAGCXXGGCCXGGGCAAXAXA 

44320 *■ + + * 

C^AGGACXGGXGCCAGXACCGGXGGGGACCCaCCXAGaaCXCGaaCCGGACCCGXTAXAX 

7 r v x rt a y G P h : k l k a q a l : 

GGCC AXGGGGG AC AXCXXG A XGXGC AXGGLGGXC A TXC: AC XG A IXG A A AC GAGGG A AGO 

44330 + + + + T ' ft--*. 

CC:jGrACCCCCXGXAijAACXACACGXACCGCCAGTAAGGTGACXAACXXXGCTCCCXX : -C 

A rt P 3 M K I H A T rt G 3 I 5 V L 3 P 

AAGACAXTCGGCCGCGXAXXXGCCCAXGGGCGAijCGGXGCCACXCCCGGXACXCXGCAAA 

44440 t- + - i * t- i.i.. 

7 7 C X G X A A G C C G GC G C A X A A A C 0 G G X A C C C GC X C G C 2 A C : iG X G A G G G C C A X G A G AC G X X X 

L 2 A A t i; G M P 3 K H W i r; i a n 

•*j A G C X ' j C X C X G G CC o G XX • j h h i j G CTX C C a C G G C C 2 G C 7 G C X G A G * j A I T G 2 G C A " A A C A A A 
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. ^ + 44559 

44500 + **- * + . _ 

CICGACGAGACCGGCCAACTXCCGAAGGXGCCGGGCGACGACXCCXAACGCGIAIXGaax 

LQEPRNEAEVARQQPNRrtVE 
GGXGGCAACAICCXGGXGCAXGGXGGCAGCCriCXCGCGGGXCCCCGIAAAACAXAXGGAA 

j — -v — -4b 1 ? 

44560 + + + * 

CCACCGXXGXA6GACCACGTACCACCGXCGGXGAGCGCCCAGGGGCAXXXXGXAXACCXX 

d XAVDQHMXAAVRPDGYEMHE 

AGGAAIGGCGXGAAAGAGACACXGGGIGACGGCCCGGGXCCXCXCGGAGAAGGCAAAGGC 

446^0 + + * * + * 4467? 

TCCXXACCGCACIIXCXCXGTGACCCACXGCCGGGCCCAGGAGAGCCXCXXCCGXTXCCG 

o piAHELCQXVARXRESEAEA 

CACCAGCCCGXXCACCAAAACAGXCXGCXCXGXCCGCXXGXCGGCGGGAXXCGGGGCCAG 

44630 + + + + + + 447 * 9 

GIGGXCGGGCAAGXGGXXXXGXCAGACGAGACAGGCGAACAGCCGCCCXAAGCCCCGGXC 

tl V L,a,MVLVTQETRKDA?N?AL 

CXGCXGCQXAACGXCAXXGXCCACCGACACACGCACGGCACGGGXGAAAGXGGGGCAGGI 

, « —_——————+——— — — — — — 44/ -9 

4 4740 —————— ——■+■ — — — •— — — — — +— — — — — — — — — ^— — ——————— t* t - * ' •* » 

GACGACGCAXXGCAGXAACAGGXGGCXGXGXGCGIGCCGXGCCCACXXXCACCCCGXCCA 
d QQlVDNDVSVRVA'RXEXPCX 

CAXGAAXGAGGCGCXGAGGXCCCXGAXCAXGCCCACGGXGGGGCGGAGGXCGGAGAXCXC 

44300 + + + * ^ 

GXACXXACXCCGCGACXCCAGGGACXAGXACGGGXGCCACCCCGCCXCCAGCCXCTAGAG 

d MESASLDRIHGVXPRLDSI5 

CAGCAGAICCCXGAGCGXCCCAXXCICCAAAIXGXCGAGGAXGXCCXCGXCCCXGGXAAA 

, dJ^lM 

44660 + + + * t.-i- 

GTCGXCXAGGGACXCGCAGGGX AAGAGGXXXAAC AGCXCCXAC AGG AGC AGGGACCATXX 

a LLORLXGNELMDLIDSDRXE 

AXGGXGGCXGAAGGCXGGCCCGXXGXAGGCCAGGGXCXGGGCCACGXGCXGAAAGXCCAC 

— — — — — — — — .1 *1 ^1 7 

a 4 q _ h — — *- — — — ! — *■ -i-i ^ .' - 

IACCACCGACXXCCGACCGGGCAACAXCCSGXCCCAGACCCGGXGCACGACXXXCAGQXG 

t1 H H S E A P G M f A L I Q A V H Q E D V 

CCCG AGGCCGCACAXGXGGGCAXXGGXGCAGGXXGGGAGGA A AACGX AGX A A AAGAXCXX 

^4?'o0 + + + + 

GGGCXCCGGCGXGXACACCCGXAACCACGXCCAACCCXCCXXXXGCAXCAXXXXCXAGAA 

d GLGCHHANICXPLEVtYEIK 

XXCC AGC ACAXCCGC AIGCCCCXC AICXACAXAAGGGCCX AGGXGCAG ACGG AA AXCGXG 



45040 > + — — + 

AAGGXCGXGX AGGCGX ACGGGGAGXAGAXGX AXXCCCGGAXCC ACGXCXGCCXXX AGCAC 

ELVOAHGEDVYPGLHLREDrt 
GXCGXGGXCXCCiiXXrtACCCGGXAGCCGXACArtGGCCACAriAXXGGGCAGCCAXCXCAXC 
4 J ' ° C AGCACCAG AGGC AAXXGGGCCAXCG GCAXGXXCCGGXGXTX AACCCGXCoGX AG AGX AG 



4S0M 
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DH0G«VfiY3TLAV = QAArtSO 

CAXGXXXCCAACCCXCXCAATAAaCXGGGGCGCGGCCAGGGXGXCAGCGXAAACCTCAXX 
5160 + + -r + + + 4221° 

GXACAAAGGXXGGGAGAGTTAXTTGACCCCGCGCCGGXCCCACAGXCGCAIXTGGA6XAA 
(1 N G V R E IruPAALTDAYVEN 

TCCGaXAAXAATCXGGGGGGCCCGGXCACXAaCGGXGAGAAGAXGGGXGaaaAXGXCTGX 
521:0 + + v + . 45^' ? 9 

AGGCXAXXAXXAGACCCCCCGGGCCAGXGAXXGCCACXCXXCXACCCACXXXXACAGACA 
3 IIIQPftRDSVTLLHTE IDT 

GXAGGCCACCGGGGGGAGCAGGXTAGGGXCCAGGAGAGCGCAGACAXACXGACCCACGCX 
5230 + -i- -i- + 4. ^ 45339 

CAXCCGGXGGCCCCCCTCGXCCAATCCCAGoTCCXCXCGCGXCXGXATGACTGGGTGCGA 
YAUPPLLNPDLLACVYQGVS 

CXCATCCCCCACAACAXCXGACCCGGCCAGGCGCATCAGGGCCXGCXCTAGGGCXAXAAG 
5340 * , 4 ^ ^ + 453^9 

GAGXAGGGGGXGXTGTAGACX3GGCCGGTCCGCGXAGXCCCGGACGAGAXCCCGAXAXXC 
E 3 '2 V V D 3 . G A L R M L A Q E L A I L 

XXCCCCAXAGAXXXTTCXAXACATGGAAXAGGCCXCCXXGGAGAIGGCGXTATXXCCCAG 
54 00 + + h, + ^ + 4545g 

AAGGGGXAXC T A AAAAGAT ATGX ACCXXATCCGGAGGAACCXCXACCGCAfiXA A AGGGTC 
E 3 i' IKRYH3YAEKS I A N N G L 

GTGGCGGCAGAXGAACXXGAXCAXGGAAAAGCTGTXCACAAAGGCAAGCCTCCCXGACCG 
S460 ^ + + + + 45519 

CACC3CC3XCXACXXGAACXAGXACCXXXXCGACAAGXGXXXCCGXXCGGAGGGACXGGC 
H 3 C 1 F K I rt 3 H 3 N V F A L R G S R 

rTCCCAGTAGGTGTrGATGCACmGGGACACCAHAGGC^CGTTCATGACAAACTTTTCCTC 
— " , ^ —— — t.—— ——■ 455"^• :^, 

AAGGGXCAXCCACAACXACoXGXCCCXGiGGITXCCGTGCAAGXACTGXXXGArtAAGGAG 
E * ' X N X C L 3 V L ? V H h V F K E E 

AAACCCGXGGATCaXAGCCXCGACXACGXAGAAGAAGGCTGGATAGGCAGXGXCAXAGGC 
ww«0 + -t- * + 45639 

XXTGGGCACCXAGXAXCGGAGCXGATGCAXCTXCXXCCGACCXAXCCGXCACAGXAXCCG 
F G H I rt A E V V Y F c A P i A X D Y A 

AGTAXCCXGCACAGXCXCAMXAriCGGCCXGAXCCACCACGXGGGCCAGAGArGIGGCGGX 
5640 f + + + 4. ^ 456?9 

XCAXAGGACGXGXCAGAGXXAXTGCCGGACXAGGXGGXGCACCCGGXCXCXACACCGCCA 
ID Q V X E IVAQDVVHALSTAT 

CXC A A ACXGCXGCCCCCGGGCCXCXXGGAATGCAGCXGGGGCCAGGGGAGXCGGCAGGXX 
5700 * * 4- 4- + + 45759 

G A G X XT G A (3 G A CG G G G G C CC G G A G A A C CX X A C G X C G AC C C CG G X C C CC X C A G C CG X C C A A 
^ z *-i Q * fe A E Q F « A P A L P X P L iM 
ACCCACCAXXrtGCCGGXGCACAGCCCTGXGCCXGGCCCXCXCCCCGGCAXCCCXGCCAAX 
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45760 + -r +■ + -r 4531 :• 

xgggxggxaaxcggccacgxgxcgggacacggacc3ggagaggggccgxagggacg6xxa 
gvhlrhvarhraregadrg: 

SXAAAXAXCAIAAAGGGGGXGCAGCXCCaGCCGCAGCAGGXCAIAAXXGGACGGGXGGAG 

CAIXXAXAGXAXXXCCCCCACGXCGAGGXCGGCGXCGXCCAGXAXXAftCCXGCCCACCXC 

Y IDYLPHLELRLLDYNSPHL 

GAAGXCXXCGGXGGGCAGCCCGCACXXGAGAGCXAXAXCXGTCACGGGGGCXGCAXACXX 

45880 ■** + + + + 45939 

CXXCAGAAGCCACCCGXCGGGCGXGAACXCXCGAIAXAGACAGXGCCCCCGACGXAXGAA 

EDEXPLGCKLA I D X V P A A Y K 

GXXAXCAXAGAACXCGXCCACAAXAACAAGCACAXXCAXGXGAXXGGGCCXCCXGXGXXG 
45940 + + + + + + 45999 

caaxagxaicxxgagcaggxgxxaxxgxxcgxgiaagxacacxaacccggaggacacaac 
ndyfedv ivlvnmhnprrhq 

CA6G6AGXAGGXCXCGCGCCXGXCXCGCGGGGCCGGGGCCGCGXXGAGGCXGXXXA66GX 

46000 + + + •*■ 46059 

GXCCCXCAXCCAGAGCGCGGACAGAGCGCCCCGGCCCCGGCGCAfiCXCCGACAAAXCCCA 

LSYXERRDR PAPAAN. L3NL-I 

aXGGGCGGGXGXGXGGAGXCGGGGGXGACAGAGmACCXXG AGAGCAXTCXGXAGGXTAAA 

46060 + + + + + - -.611 9 

TACCCGCCCACACACCXCAGCCCCCACXGXCXCXXGGAACXCXCGXAAGaCAXCCAaTXX 

HAPTHLRPHCLVKLANQLbE 

CGCGAGGAGAAGGXXAXXCXXGXXXACGAXCCrtXGCCXCCACCGGXAGCXGCTGXGToGG 
46120 + + + -!■ 461^9 

GCGCXCCXCrXCCAAXAAGAACAAATGCXAGGIACGGAGGrGGCCAIC.lHCGHCACACCC 
ALLLMNKNV XUASVPLGGXr 



GXXGXCCAGCAXXXXGAXGGCG6CGGAGGXCGX6XACXTGGGaXXG6GCATAAmCi4G6CC 

46130 * 1- + -r + 

CA AC AGGXCGX A A A ACXACCGCCGCCXCC AGC AC A XG A ACCCXA ACCCGX AXXXGXCCoG 

N D L h IrtASXXYKPNPriELG 

C ACXGGGAA^XAGXAGCXGX ACXGC AXXCXXCX6XXGA66G6GXAXGG66ACXGA6X6XC 

46240 + + + h + + 46299 

GXGACCCXXXAXCAXCGACAXGACGXAAGAAGACAACXCCCCCAXACCCCXGACXCACAG 



V P E Y Y 3 Y Q n ft R N L P Y P 3 G X D 

AXXGXACAXCXXXXGCAGGCXXXCCACGGCCACCGCGXGGXXGCCCAGCXXGrtTGACGGC 

46300 + + + + + 46359 

T AACAIGXAGAAAACGXCCGAAAGGXGCCGGXGGCGCACCAACGGGXCGArtCXACXGCCG 

ft ii K G L S E V rt V A H N G L K £ V « 

'j6CXUmG«TCGGCACCC;K;iGGCXGAXCCX 



16360 <*■ ■ 



46419 



CCGACXCTAGCCGXnG0CCCCGMCXAGGAijCXGiiGGMCGCC:3GXGXCGGCCGXCCAGXCX 






1-1 O X 



I ? V R P Q D E V G A A V A P L D 



CXXGGXGCXXCCGGCXXXXXCCGGXGAGXCCACGAXCCXAGCCAXGAAAXGCXCAAACGX 
464-0 + + + + + + 46479 

GAACCACGAAGGCCGAAAAAGGCCACXCAGGXGCXA6GAXC3GXACXXXACGAGXXXGCA 



KXSGAKEPSDUIRAriEHS 






ACGCAXCACGCGCCCGXAGCXCACGGCAGXGaCCAGGXXCXCCCCCCGXACCACAAAAGA 
46430 + + + + ■+• 46339 

XGCGXAGXGCGCGGGCAXCGAGXGCCGXCACXGGXCC AAGAGGGGGGCAXGGXGXXTXCX 



RttVRGYSVAXVLNEGRYVr 



AGCAXAGCXCGAGGGCCCCAXAAXCXGGXXGXCGGCCXCCXCACCCAGGAAGGXCAAGAG 
46540 + + + -r + + 46399 

XCGXAXCGAGCXCCCGGGGXAXXAGACCAACAGCCGGAGGAGXGGGXCCXXCCAGXXCXC 



Y 3 3 P G M IQNDAEEGLEXLL 



CXGGCGCAGAACGXXGXCGGXGACAAIAAACACCCCCCCCACXGGCXCXCCCCCCXXGGC 
46600 + + -r + + + 4665? 

'jh-wo'juoio* i'JunMOnijuortLiOi ini 1 1 oiouooU'JU'j iijtiLuonuHijijou'j'jnnuoo 

Q R L V N D X V IEVG6VPEG 5 K A 

GGICGXGTAGGXACXGACCCCCXXGAGCACGCXCXCCCCGGACACGGCCGCTACCATCXC 

46660 + + + + + + 46719 

CCAGCACAXCCAXGACXGGGGGAACXCGXGCG AGAGGGGCCXGXGCCGGCGAXGGXAGAG 

XXYX3VGKLVSEGSVAAVME 

AGAGAGACGGCXXCGCACGXACXGAGAAAACCCGGAGCCCAXGXXCXCGGCCCGGXCCAG 
46720 + + + + + + 46779 

TCXCXCXGCCGAAGCGXGCAXGACXCXXXXGGGCCXCGGGXACAAGAGCCGGGCCAGGXC 
SLRSR'v' YQSEGSGrtNEARDL 

.BAaGAAGGAGXGCTCCAGCAoAXGCCTCXXGAACAXGGCAhXGAGGXCAGACXTGAChGX 
1- -2- 468 2 9 

CXXCXXCCXCaCGAGGXCGICXACG 6AGAACXXGXACCGXXACXCCA6XCXGAACX6XCA 



ErSHELLHRKEttfl I L D 3 K 'J X 

CXXGGaGmwCCCCCXCXCAGXGAaGGXGGGAICCGCCaGGGXCXGCaGGAXAAACAXGGG 

46340 +■ + + + + 4639? 

GaaCCXCXXGGGGGAGaGXCACXXCCACCCXAGGCGGXCCCAGACGXCCXAIXXGXACCC 

K 3 E 6 R E X E X P D A L X Q L I £ rt P 

AGGGGCAXGGCGAAGCXXCACACXCAGGACGGXGXXAAXGAGGCCCCXCXCCAGGGCAXC 
46900 ■*■ — — — — — — + — - — + — — .+.— — — — — — — 46959 

XCCCCSXACCGCXXCG A AGXGXG A6XC CXGC C AC A AXXACXCCGGGG AG AG6XCCCGX AG 

PAHRLKVSLWTN ILGRELA0 

GACCCCAAaCXGXAGGGCCGAGGCCACGGXCXXGACAGCCCCCACGXACXCXGCGXaCXC 

46960 ~ * ^ + t 47019 

CXGGGGXXXGACAXCCCGGCXCCGGXGCCAGAaCXGXCGGGGGXGCAXGhGACGCATGAG 



I 



■ 2 c .1 L A 3 A V X K V A G y Y E A r c 



uACCiinGGXCTCouabAtACXAXGCAijGHTCTCCAGrtXCCAijCAXijQACAiiXXCCAXXXC 
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020 + + 4 '° 7 

CXGGCC2C AGAGCCCCXAXGAI ACGXCCX AGAGGXCXAGGXCGX ACCXGXCAAGGXAAAG 

V p I S P 13 HL lELDLrtSLEr*. E 

C G X A C X A A XGX G G X G X X X G X G G C A A X X X XX G A C C AC A A X G A A X GX C C G C XG C X X G CX GG G 

080 ■•■ - + ^ T 4713* 

GCAXGAXXaCaCCACA A ACACCGXXAAAAACXGGXGXX ACXX ACAGGCGACGAACGACCC 

I = I H H K H C N K V V I E X R Q K S ? 

rCXCCXXCCGXCCCCGXGAGCAATGGXGGGGACGGAGAXXCGAAAXXGAAlCXXGCCAXC 

■ 140 + + - + + + 4719' 

AGAGGAAGGCAGGGGCaCXCGXXACCACCCCXGCCXCXAAGCXXXAACXXAGAACGGXAG 

R S G D G H A X X P V 3 I R E Q I K G D 

CGXCAXACGACXCAGGXCXXXGhAXXCCGXGXXCACACAGGACACGGCCAGXGCCGXCXC 
7200 + + + + + + 4720 

gcagxaigcxgagxccagaaacxxaaggcacaagxgxgxccxgxgccggxcacggcagag 

tk3sldkfetnvcsvalate 

caggaagcgaacaxaxxggaxggcgxxcgxgxaghCCCCGagxagcaccxcaaactxgat 

_ — — — -4- !- 1 4 / Jl 

gxccxxcgcxxgxaxaaccxaccgcaagcacaxcxggggcxcaxcgxggagxxxgaacxa 



72S0 



L E 3 V t Q lANXYyGLuVEcK j. 
GCCCGCCTCXCTGGCaTC^XXGCCCACCAGCAGGXCAAAGCXAXGAAACAACCCCXCAGC 

* *^ 



320 ^ 



* « *^ 

4 .- o / 



CGGGCGGAGAGACCGXAGGA ACGGGXGGXCGXCC AGXXXCGAX ACXXXGXXGGGGAG.LLG 

GaEaADkGVLLDESHELGEA 

CGCXGACXGCCoCaGGXXCGAGaGCAGGXCGGCaXCCACCGXCAGAXAGGGGAAGGGICT 
— **7-£' 

3CoAC"nACGGC:rrCCAAGCXCXCGXCCAGCCGXAGGXG.3CAGICXArCCCCTTCCCAGA 

A 3 -3 f; L 3 I- i. Li A 0 y T L t P E ? S 

G X X X X C C A C A C C C X £ A X X X G A G G C C A X G A C A C A ft G G X A A G A G G G A G A X G G G G G G A G G 



7440 + * * 

CArtHAGGXGTGGGAGIAHACXCCGGXACXGXGXXCCrtlXCXCCCTCIACCCCCCXCC 



474Vb 



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SSSaSU of the β-gal::p150 fusion proteins 
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Figure 31: 

Map of the p150 encoding region 
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